summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* tools: Support --quiet option for oping serverDimitri Staessens2022-03-302-2/+6
| | | | | | | | The oping server will not print receiving packets when the --quiet (-Q) flag is passed, like the client. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Don't log prefix in syslogDimitri Staessens2022-03-301-2/+1
| | | | | | | | | | | Logging the prefix in the system logs has a lot of duplication, as the syslog already includes the name of the daemon. Maybe we should deprecate logging to stdout and focus on the syslog, revise things a bit to print internal component names if they are defined. For now, I think it's less annoying to read the syslog without the prefixes. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Non-configurable delayed acks in FRCPDimitri Staessens2022-03-304-30/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It doesn't really make sense to manually and one-sidedly configure the timeout of delayed acknowledgements, as setting it too high upsets the peer's sRTT estimates. Even worse, it also causes a lot of spurious retransmissions if it exceeds the sRTT mean deviation calculated by the receiver. Compensating on bare acknowledgment for the ack delay could improve the RTT estimate deviation, but not the spurious retransmissions if it was set too high. This sets the delayed ack to wait for a single RTT mean deviation. Probably needs more tweaking to further reduce differences between the RTT estimates at the sender and receiver, e.g. compensate the RTT estimate for delayed acks, or increase the RTO to add 8 mdevs to sRTT instead of 4. However, it looks like the mdev estimate is the trickiest one to get to sync, not the RTT average. Linux reduces the sample weight for mdev from 1/4 to 1/32 in some cases, will give that a shot some day too to see if that further align sRTT estimates. In any case, this patch already improves things a lot. Also fixes a bug where the sender was sending acknowlegments on the first packets in flight for the 0 sequence number. The receiver activity was measured in seconds but compared to a timeout value in nanoseconds. There's still a lot of spurious retransmissions that start after actual packet loss occurs, I'm still investigating what causes it. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Expose flow control metrics to RIB0.19.1Dimitri Staessens2022-03-163-15/+49
| | | | | | | | | | This exposes some additional metrics relating to FRCT / Flow control: the number of duplicate packets received, number of packets received out of the flow control window and / or reordering queue, and the number of rendez-vous messages sent. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix retransmission schedulingDimitri Staessens2022-03-161-72/+60
| | | | | | | | | | | | | There still were a couple of bugs in the timerwheel. If the future schedule was coinciding with the slot currently being processed (i.e. exactly RXMQ_SLOTS in the future), the list_add_tail caused an infinite loop. Another bug was causing the slots at higher levels to be processed too soon. Retransmissions should now schedule correctly. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix buffer allocation when retransmitting0.19.0Dimitri Staessens2022-03-116-20/+52
| | | | | | | | | | | | | | | The timerwheel was retransmitting packets and the error check for negative values of the rbuff allocation was instead checking for non-zero values, causing a buffer allocation to succeed but the program to continue down the unhappy path leaving that packet stuck in the buffer unattended. Also fixes wrongly scheduled retransmissions that cause packet storms. FRCP is much more stable now. Still needs some work for high bandwidth-delay products (fast-retransmit). Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix memcpy with NULL in piggyback APIDimitri Staessens2022-03-084-9/+17
| | | | | | | | If there is no piggyback data, memcpy was passed a NULL pointer in memcpy(buf, NULL, 0) calls, which is undefined behaviour. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Refactor kad_req_createDimitri Staessens2022-03-081-20/+25
| | | | | | | A small refactor of the kad_req_create function's cleanup code. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* build: Add debug option for fsanitize=undefinedDimitri Staessens2022-03-081-1/+3
| | | | | | | | The sanitizer for undefined behaviour can now be enabled using DebugUSan build option for convenience. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipdpd: Pass MPL to application at flow_allocationDimitri Staessens2022-03-0818-25/+91
| | | | | | | | | | | | The maximum packet lifetime (MPL) is a property of the flow that needs to be passed to the reliable transmission protocol (FRCP) for its correct operation. Previously, the value of MPL was set fixed as one of the (fixed) Delta-t parameters. This patch makes the MPL a property of the layer, and it can now be set per layer-type at build time. This is a step towards a proper MPL estimator in the flow allocator. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Pass Delta-t params to frcti_create()Dimitri Staessens2022-03-082-9/+9
| | | | | | | | | The parameters were set directly from the build configs. A first step to making FRCP configurable at runtime, is to pass the parameters to the frcti_create() function. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix RTT estimator invocation in FRCTDimitri Staessens2022-03-031-1/+1
| | | | | | | The notorious off-by-one hit again. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix underflow in keepalive timerDimitri Staessens2022-03-031-1/+1
| | | | | | | If the keepalive would underflow if set to 1-3 ms. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Flag all flows down as the IRMd exitsDimitri Staessens2022-03-034-14/+28
| | | | | | | | | | | | | | On exit of the IRMd all flows will now be flagged as down, so external applications will not hang anymore. Note: reads keep work from flows that are down until there are no more remaining packets in the buffer, but no more packets can be written. When the RIB is used, the external application may exit a bit later than the IRMd, so I added a brief sleep before the IRMd tries to remove the fuse main directory. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix lock reversal in timerwheelDimitri Staessens2022-03-031-5/+0
| | | | | | | | | There was a lock reversal in the timerwheel. There still is a thorough revision needed of the locking in dev.c after the FRCP logic is completed and tuned. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* irmd, ipcp: Remove socket option in acceptloopDimitri Staessens2022-03-032-14/+3
| | | | | | | | | | | | | We cancel the thread, so the SO_RCVTIMEO is not needed anymore (it dated from when we checked the state every so often. The address sanitizer is complaining about the the cleanup handlers in the acceptloops after the thread gets cancelled in the read(). I've tried to resolve it, but no avail. Pretty convinced it's a false-positive, so ASan will ignore these functions for now. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Cleanup RIB mount nameDimitri Staessens2022-03-031-1/+3
| | | | | | | | | | IPCPs would call rib_fini() twice, once after cleaning up their managed RIB, and once again for the program-generic RIB, which is not initialized for IPCPs. rib_fini() checked if the mount name was valid, but it didn't unset it after execution. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Check return value of rib_initDimitri Staessens2022-03-031-1/+6
| | | | | | | The rib_init return value wasn't checked. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Encrypt bare FRCP messages on encrypted flowsDimitri Staessens2022-03-033-44/+29
| | | | | | | | Bare FRCP messages (ACKs without data, Rendez-vous packets) were not encrypted on encrypted flows, causing the receiver to fail decryption. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Make flow liveness timeout configurableDimitri Staessens2022-03-0311-53/+111
| | | | | | | | | The qosspec_t now has a timeout value that sets the timeout value of the flow. Flows with a peer that has timed out will now return -EFLOWPEER on flow_read() or flow_write(). Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* tools: Fix return value on error in ocbrDimitri Staessens2022-03-031-3/+8
| | | | | | | The ocbr tool was returning 0 on error. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* tools: Fix signed/unsigned mismatch in irm_enrollDimitri Staessens2022-03-031-31/+30
| | | | | | | The irm_list_ipcps function can return negative values. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Remove dead code in timerwheelDimitri Staessens2022-03-031-6/+0
| | | | | | | The checked condition can't happen. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* irmd: Fix memory leak of ret_msgDimitri Staessens2022-03-032-2/+2
| | | | | | | | The ret_msg variable can leak in the main loop of the irmd in this failure path. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix some unchecked return valuesDimitri Staessens2022-03-034-24/+45
| | | | | | | Fixes some unchecked and wrongly checked return values. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Add initial flow liveness monitoringDimitri Staessens2022-02-243-23/+163
| | | | | | | | | | | | | | | | | | This adds flow liveness monitoring for flows, with a fixed timeout of 120s. I will make it configurable at flow allocation later on (timeout needs to be communicated to the peer). If one peer dies, or doesn't call any IPC calls (flow_write/flow_read/fevent) it will stop sending keepalives and the other peer's read/writes will error on an -EFLOWDOWN after the timeout expires. Packets without a payload (0 length packets) are interpreted as keepalive packets for the flow. They can be sent from any application, but they will not trigger a message read at the receiver side (0 as a return value on flow_read indicates a previous partial read has completed at exactly the buffer size). Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Maintain a list of flows in flow_setDimitri Staessens2022-02-241-26/+99
| | | | | | | | | The flow_set will now keep a list of the flows in the set, this makes it more efficient to iterate over the flows. Extending the public API for fset_t with an iterator will also be useful. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix potential double unlock in ethDimitri Staessens2022-02-211-23/+15
| | | | | | | | | When handling management frames, there was a cancellation point after the unlock, which would cause the cleanup handler to attempt a double unlock if the thread was cancelled at that point. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Handle FLOWDOWN during blocking readDimitri Staessens2022-02-211-3/+7
| | | | | | | | | | The blocking read from the rbuff was not correctly handling flow down states, returning a valid index. The attempt to fetch the header then failed on an assertion. The blocking read will now return -EFLOWDOWN if the flow is marked down by the IPCP. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Refactor sendingflow allocation responseDimitri Staessens2022-02-211-21/+33
| | | | | | | | Small refactor taking the wait for the flow allocation to complete out. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Refactor flow allocator message handlingDimitri Staessens2022-02-211-113/+170
| | | | | | | | This refactors the single long function that handles incoming packets destined for the flow allocator. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix hashing and overlapping memcpy in pffDimitri Staessens2022-02-182-12/+4
| | | | | | | | | The pft hash function assumed mem_hash allocates memory, but it does not. There was also a memcpy with potentially overlapping memory regions, which is undefined behaviour. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix deadlock in dht_unregDimitri Staessens2022-02-181-8/+3
| | | | | | | | | The dht_del function was called under lock in dht_unreg, and then tried to take the lock again, a 100% deadlock. Also fix uninitialized value in dht_retrieve. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Use random buffer for flat addressDimitri Staessens2022-02-181-15/+4
| | | | | | | Less code, and less chance of a collision. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix potential overflow in DDNS resolverDimitri Staessens2022-02-181-1/+1
| | | | | | | The count value could be IPCP_UDP_BUF_SIZE, overflowing buf. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Remove unused wait_state functionDimitri Staessens2022-02-182-34/+0
| | | | | | | Probably a leftover from previous shutdown logic. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Check return value of enroll_packDimitri Staessens2022-02-181-2/+2
| | | | | | | Better to check the error code than the out parameter. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix bounds check on PROG_MAX_FLOWSDimitri Staessens2022-02-181-2/+2
| | | | | | | Off-by-one error in the bounds check. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* irmd: Fix race condition in sanitize threadDimitri Staessens2022-02-181-4/+5
| | | | | | | | | Unlocking the flows while iterating could cause a modification during the iteration. Added pthread_cleanup handlers as the thread could get cancelled while holding a lock. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix initialization of hash width in DHTDimitri Staessens2022-02-171-5/+7
| | | | | | | | | | The width of the Kademlia hash function (dht->b) was set only after the ID was created. This should have failed miserably, but the bytes after were fine as they were just a randomized ID in the Kademlia network. Nasty. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* irmd: Fix argvdup util functionDimitri Staessens2022-02-171-10/+12
| | | | | | | | The argvdup function didn't handle the case where argc is 0. Small refactor that also handles this case correctly. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix free in fail path of readdirDimitri Staessens2022-02-174-3/+12
| | | | | | | | | The free of the buffer in the failure path of the readdir RIB functions was taking the wrong pointer in a couple of places. The FRCT RIB readdir was missing error handling for malloc and strdup. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Don't use pointer to set FRCT flagsDimitri Staessens2021-12-293-13/+18
| | | | | | | | | | | | | | | | | | The fccntl call FRCTSFLAGS was using a pointer to a flags so set flags, which should just be a regular uint16_t. For instance, the FRCTLINGER flags can now be turned off using fccntl(fd, FRCTSFLAGS, FRCTFRESCNTL | FRCTFRTX) leaving only resource control (flow control, FRCTFRESCNTL) and retransmission enabled. Note that retransmission (FRCTFRTX) can't be enabled or disabled on a live flow, it will be set on flow allocation. Updates the man page for fccntl to add these FRCT options. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Use wrlock for rotating multipath pff entryDimitri Staessens2021-12-291-12/+14
| | | | | | | | The multipath pff entry was modified (rotated) under a read lock, which is now changed to a write lock. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Allow creation of multiple directoriesDimitri Staessens2021-12-295-421/+523
| | | | | | | | | To allow merging large network layers, a situation will arise where multiple directories need to coexist within the layer. This reverts commit 9422e6be94ac1007e8115a920379fd545055e531. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Ease lock in timerwheel0.18.4Dimitri Staessens2021-12-222-3/+3
| | | | | | | It was taking a write lock when a read lock was sufficient. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix waiting for FRCT at deallocationDimitri Staessens2021-12-221-6/+6
| | | | | | | | | | This is a fix to wait for outstanding retransmissions when a flow is deallocated. Instead of waiting the full timeout, it will now wait in the same tic increments used within FRCT. Bit of a stopgap at the moment, FRCT and the flows are in need of a serious refactor. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Add missing rwlock unlock in FRCTDimitri Staessens2021-12-221-2/+4
| | | | | | | There was a missing unlock in FRCT. Also fixes some indentation. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Fix waiting for FRCP to time outDimitri Staessens2021-12-222-2/+3
| | | | | | | | | | The timeout variable was not correctly passed to the IPCP, causing flow IDs to be reused immediately instead of waiting for the full Delta-t to expire. This caused all kinds of havoc with retransmissions in reliable flows. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix flow dealloc after expired FRCT timeoutDimitri Staessens2021-12-221-0/+1
| | | | | | | | | | If the timeout is already expired, the wait variable would be negative and return a negative value for the __frcti_dealloc function, thinking that the timeout was not expired causing an unnecessary wait even if all packets are acknowledged. Signed-off-by: Dimitri Staessens <[email protected]> Signed-off-by: Sander Vrijders <[email protected]>