path: root/src/lib
Commit message (Author, Age, Files, Lines)
* lib: Fix waiting for FRCT at deallocation (Dimitri Staessens, 2021-12-22, 1 file, -6/+6)
| This is a fix to wait for outstanding retransmissions when a flow is deallocated. Instead of waiting the full timeout, it will now wait in the same tick increments used within FRCT. Bit of a stopgap at the moment; FRCT and the flows are in need of a serious refactor.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Add missing rwlock unlock in FRCT (Dimitri Staessens, 2021-12-22, 1 file, -2/+4)
| There was a missing unlock in FRCT. Also fixes some indentation.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix flow dealloc after expired FRCT timeout (Dimitri Staessens, 2021-12-22, 1 file, -0/+1)
| If the timeout had already expired, the wait variable would be negative and the __frcti_dealloc function would return a negative value, as if the timeout had not expired, causing an unnecessary wait even if all packets were acknowledged.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Set initial sender rwe to sender seqno (Dimitri Staessens, 2021-12-22, 1 file, -1/+1)
| The initial sender right window edge (indicating acknowledged packet sequence number) was initialized to seqno - 1. This should be the same as seqno, since we acknowledge with the next expected sequence number. It also indicates that a flow without traffic has no outstanding acknowledgements.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Remove old rib_fini code (Dimitri Staessens, 2021-12-06, 1 file, -3/+0)
| There was some leftover code in dev.c with respect to the process RIB that is not needed anymore.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix undefined behaviour in sha3 (Dimitri Staessens, 2021-12-06, 1 file, -2/+1)
| Arithmetic with NULL pointers is undefined behaviour. Caught by clang 13. Fixed by using uintptr_t, which is guaranteed to be able to hold a pointer value.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
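The kind of fix described above can be sketched as follows. This is a minimal illustration, not the code from sha3: the helper name `bytes_to_align` is hypothetical, and the sketch only shows the general idea of doing the arithmetic on a `uintptr_t` copy of the address instead of on the pointer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes to skip from p to reach the
 * next multiple of align. All arithmetic is done on an integer copy
 * of the address, so no pointer arithmetic is involved. */
static size_t bytes_to_align(const void * p, size_t align)
{
        uintptr_t addr = (uintptr_t) p;

        /* align must be a non-zero power of two */
        assert(align != 0 && (align & (align - 1)) == 0);

        return (size_t) ((align - (addr & (align - 1))) & (align - 1));
}
```

Because the address is only ever treated as an integer, the helper can be called on any pointer value without performing pointer arithmetic.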
* lib: Don't initialize process RIB for IPCPs (Dimitri Staessens, 2021-07-10, 2 files, -10/+5)
| This will skip rib_init() at __init() for IPCPs (or at least, processes that have "ipcpd" in the executable name). The previous code tried to unmount the generic mount and then remount under the ipcp name, but it often failed because fuse_mount() is asynchronous and the mount was not up at the time of the unmount() call. Renaming the mount instead of unmounting failed for the same reason. This is a better fix for now.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Application RIB with FRCT statistics (Dimitri Staessens, 2021-06-30, 5 files, -17/+173)
| Application flows can now be monitored from the RIB, exposing FRCT statistics (window edges, retransmission timeout, RTT estimate, etc). Application RIB requires user permissions to be able to access /dev/fuse.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Pass full path for RIB entries (Dimitri Staessens, 2021-06-29, 1 file, -10/+9)
| The read functions for the RIB will now receive the full path, instead of only the entry name. For IPCPs, we organized the RIB in an /<ipcp>/<component>/entries structure with a directory per component, so we don't need the full path at this point.
|
| For process flow information, it's a lot more convenient to organize it the following way:
|
|   /<pid>/<fd>/stat
|
| We can then register/unregister the flow descriptor when the frct instance is created, and for getting the stats, we'd know the flow descriptor from the fuse file path.
|
| If we would create a file per flow instead of a directory per flow, something like
|
|   /<pid>/flows/<fd>
|
| we'd need to do additional bookkeeping to list the contents of that directory (we would need to track all flows with an active FRCT instance), which fuse already knows because it tracks the directories.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
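To illustrate why the directory-per-flow layout is convenient, here is a rough sketch of how a read function could recover the flow descriptor from the path it receives. The function name `rib_path_to_fd` is hypothetical, and the exact form of the path passed to the real read functions is an assumption based on the /<pid>/<fd>/stat layout in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed path layout (from the commit message): /<pid>/<fd>/stat.
 * Returns the flow descriptor, or -1 if the path does not match. */
static int rib_path_to_fd(const char * path)
{
        int  pid;
        int  fd;
        char entry[8];

        assert(path != NULL);

        /* Three fields must match: pid, fd and the entry name. */
        if (sscanf(path, "/%d/%d/%7s", &pid, &fd, entry) != 3)
                return -1;

        if (pid < 0 || fd < 0 || strcmp(entry, "stat") != 0)
                return -1;

        return fd;
}
```

With a file-per-flow layout such as /<pid>/flows/<fd>, the same lookup would work, but listing the flows directory would need the extra bookkeeping the commit message mentions.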
* lib: Remove struct stat from RIB API (Dimitri Staessens, 2021-06-28, 1 file, -5/+14)
| The RIB API had a struct stat in the getattr() function, which made all components that exposed variables via the RIB dependent on <sys/stat.h>. The RIB now has its own struct rib_attr to set attributes such as size and last modified time.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* build: Fix compilation with fuse (RIB) on FreeBSD (Dimitri Staessens, 2021-06-28, 1 file, -4/+5)
| Compilation failed on FreeBSD 14 with fuse enabled because of some missing definitions. __XSI_VISIBLE must be set before including <ouroboros/rib.h> for some definitions in <sys/stat.h>. FreeBSD doesn't know the MSG_CONFIRM flag to sendto() or CLOCK_REALTIME_COARSE, which are Linux-specific.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib, ipcpd, irmd: Wrap pthread unlocks for cleanup (Dimitri Staessens, 2021-06-23, 10 files, -32/+23)
| This adds an ouroboros/pthread.h header that wraps the pthread_..._unlock() functions for cleanup using pthread_cleanup_push(), as casting them to the cleanup handler type is not safe (and there were definitely bad casts in the code). The close() function is now also wrapped for cleanup in ouroboros/sockets.h. This allows enabling more compiler checks.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
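The wrapping idea can be sketched like this. The names `cleanup_mutex_unlock` and `locked_increment` are hypothetical and are not taken from ouroboros/pthread.h; the sketch only shows why a small wrapper with the `void (*)(void *)` signature removes the need to cast `pthread_mutex_unlock` itself.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical wrapper with exactly the signature that
 * pthread_cleanup_push() expects, so no function cast is needed. */
static void cleanup_mutex_unlock(void * mutex)
{
        pthread_mutex_unlock((pthread_mutex_t *) mutex);
}

/* Lock, register the unlock as cleanup handler, do the work, and let
 * pthread_cleanup_pop(1) run the handler on the normal path as well. */
static void locked_increment(pthread_mutex_t * mtx, int * counter)
{
        assert(mtx != NULL && counter != NULL);

        pthread_mutex_lock(mtx);
        pthread_cleanup_push(cleanup_mutex_unlock, mtx);

        ++(*counter);   /* a cancellation point could sit here */

        pthread_cleanup_pop(1);   /* unlocks mtx */
}
```

Calling `pthread_mutex_unlock` through a pointer cast to `void (*)(void *)` invokes it through an incompatible function type, which is the unsafe cast the commit refers to; the wrapper keeps the types honest.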
* lib: Bypass assertion in shm_rdrbuff [0.18.1] (Dimitri Staessens, 2021-06-21, 1 file, -1/+1)
| This assert() causes the ipcpd, and subsequently the irmd, to abort() when shutting down debug builds. Should be fixed some day when other components are more robust (frct retransmissions and routing).
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Move RIB initialization to common ground (Dimitri Staessens, 2021-06-21, 1 file, -0/+6)
| This moves Resource Information Base (RIB) initialization into the ipcp_init() function, so all IPCPs initialize a RIB. The RIB now shows some common IPCP information, such as the IPCP name, IPCP state and the layer name if the IPCP is part of a layer. The initialization of the hash algorithm and layer name was moved out of the common ipcp source because IPCPs may only know this information after enrollment. Some IPCPs were not even storing this information.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* build: Remove raptor IPCP (Dimitri Staessens, 2021-03-28, 1 file, -1/+1)
| This removes the raptor IPCP. The code hasn't been updated for a while, and wouldn't compile. Raptor served its purpose as a PoC for Ouroboros-over-Ethernet-Layer-1, but given the extremely niche hardware needed to run it, it's not worth maintaining this anymore.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Single UDP port for the ipcpd-udp [0.18.0] (Dimitri Staessens, 2021-01-03, 2 files, -8/+5)
| The UDP layer will now use a single (configurable) UDP port, default 3435. This makes it easier to allocate flows as a client from behind a NAT firewall without having to configure port forwarding rules. So basically, from now on Ouroboros traffic is transported over a bidirectional <src><port>:<dst><port> UDP tunnel.
|
| The reason for not using/allowing different client/server ports is that it would require reading from different sockets using select() or something similar, but since we need the EID anyway (mgmt packets arrive on the same server UDP port), there's not a lot of benefit in doing it.
|
| Now the operation is similar to the ipcpd-eth, with the port somewhat functioning as a "layer name", where in the ipcpd-eth, the Ethertype functions as a "layer name".
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* build: Update email addresses (Dimitri Staessens, 2021-01-03, 39 files, -78/+78)
| The ugent email addresses are shut down, updated to Ouroboros mail addresses.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* build: Update copyright to 2021 (Dimitri Staessens, 2021-01-03, 40 files, -40/+40)
| Happy New Year, Ouroboros!
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix flow_accept without openssl (Dimitri Staessens, 2020-12-12, 2 files, -5/+7)
| DH key creation was returning -ECRYPT if OpenSSL is not installed, instead of success (0).
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix return value in function returning void (Dimitri Staessens, 2020-12-12, 1 file, -1/+1)
| This causes builds to fail on systems where OpenSSL is not available.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Add congestion avoidance policies (Dimitri Staessens, 2020-12-02, 3 files, -6/+24)
| This adds congestion avoidance policies to the unicast IPCP. The default policy is a multi-bit explicit congestion avoidance algorithm based on data-center TCP congestion avoidance (DCTCP) to relay information about the maximum queue depth that packets experienced to the receiver. There's also a "nop" policy to disable congestion avoidance for testing and benchmarking purposes.
|
| The (initial) API for congestion avoidance policies is:
|
|   void * (* ctx_create)(void);
|   void (* ctx_destroy)(void * ctx);
|
| These calls create and/or destroy a context for congestion control for a specific flow. Thread-safety of the context is the responsibility of the flow allocator (operations on the ctx should be performed under a lock).
|
|   ca_wnd_t (* ctx_update_snd)(void * ctx, size_t len);
|
| This is the sender call to update the context, and should be called for every packet that is sent on the flow. The len parameter in this API is the packet length, which allows calculating the bandwidth. It returns an opaque union type that is used for the call to check/wait if the congestion window is open or closed (allowing locks to be released before waiting).
|
|   bool (* ctx_update_rcv)(void * ctx, size_t len, uint8_t ecn, uint16_t * ece);
|
| This is the call to update the flow congestion context on the receiver side. It should be called for every received packet. It gets the ecn value from the packet and its length, and returns the ECE (explicit congestion experienced) value to be sent to the sender in case of congestion. The boolean returned signals whether or not a congestion update needs to be sent.
|
|   void (* ctx_update_ece)(void * ctx, uint16_t ece);
|
| This is the call for the sending side to update the context when it receives an ECE update from the receiver.
|
|   void (* wnd_wait)(ca_wnd_t wnd);
|
| This is a (blocking) call that waits for the congestion window to clear. It should be stateless (to avoid waiting under locks). This may change later on if passing the context is needed for different algorithms.
|
|   uint8_t (* calc_ecn)(int fd, size_t len);
|
| This is the call that intermediate IPCPs (routers) should use to update the ECN field on passing packets.
|
| The multi-bit ECN policy bases the value for the ECN field on the depth of the rbuff queue packets will be sent on. I created another call to grab the queue depth as fccntl is write-locking the application. We can further optimize this to avoid most locking on the rbuff.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
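As a rough sketch of how the calls listed above fit together, the function pointers can be gathered in one table and filled in by a do-nothing policy. The signatures are copied from the commit message; everything else is an assumption made for illustration: the `struct ca_ops` name, the contents of `ca_wnd_t` (the commit only says it is an opaque union), and the bodies of the `nop_*` functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed contents: the commit only says ca_wnd_t is an opaque union. */
typedef union {
        unsigned int wait;   /* hypothetical: 0 means the window is open */
} ca_wnd_t;

/* The calls from the commit message, gathered in one (assumed) table. */
struct ca_ops {
        void *   (* ctx_create)(void);
        void     (* ctx_destroy)(void * ctx);
        ca_wnd_t (* ctx_update_snd)(void * ctx, size_t len);
        bool     (* ctx_update_rcv)(void * ctx, size_t len,
                                    uint8_t ecn, uint16_t * ece);
        void     (* ctx_update_ece)(void * ctx, uint16_t ece);
        void     (* wnd_wait)(ca_wnd_t wnd);
        uint8_t  (* calc_ecn)(int fd, size_t len);
};

static int nop_dummy;   /* the nop policy keeps no per-flow state */

static void * nop_ctx_create(void) { return &nop_dummy; }
static void   nop_ctx_destroy(void * ctx) { (void) ctx; }

static ca_wnd_t nop_ctx_update_snd(void * ctx, size_t len)
{
        ca_wnd_t wnd = { .wait = 0 };   /* never close the window */
        (void) ctx; (void) len;
        return wnd;
}

static bool nop_ctx_update_rcv(void * ctx, size_t len,
                               uint8_t ecn, uint16_t * ece)
{
        assert(ece != NULL);
        (void) ctx; (void) len; (void) ecn;
        return false;   /* never ask for a congestion update */
}

static void    nop_ctx_update_ece(void * ctx, uint16_t ece) { (void) ctx; (void) ece; }
static void    nop_wnd_wait(ca_wnd_t wnd) { (void) wnd; }
static uint8_t nop_calc_ecn(int fd, size_t len) { (void) fd; (void) len; return 0; }

static const struct ca_ops nop_ca_ops = {
        .ctx_create     = nop_ctx_create,
        .ctx_destroy    = nop_ctx_destroy,
        .ctx_update_snd = nop_ctx_update_snd,
        .ctx_update_rcv = nop_ctx_update_rcv,
        .ctx_update_ece = nop_ctx_update_ece,
        .wnd_wait       = nop_wnd_wait,
        .calc_ecn       = nop_calc_ecn
};
```

A sender following the description would call `ctx_update_snd` under its lock, release the lock, and only then call `wnd_wait` with the returned value, which is why `wnd_wait` takes the window value rather than the context.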
* lib: Reduce timerwheel CPU consumption (Dimitri Staessens, 2020-11-25, 2 files, -1/+10)
| The timerwheel is checked during IPC calls (fevent, flow_read), causing a huge CPU load in IPCPs, since they have a lot of fevent() threads for QoS. The timerwheel will need further optimization, but for now I reduced the default tick time to 5 ms and added a boolean to check that the wheel is actually used.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Don't default to lockless rbuff (Dimitri Staessens, 2020-11-22, 1 file, -1/+1)
| I mistakenly set the default to the (buggy) lockless rbuff implementation instead of the pthread one in commit 3aec660e.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* Revert "lib: Unmount stale RIB directories" (Sander Vrijders, 2020-11-11, 1 file, -10/+1)
| This reverts commit 978266fe4beba21292daad2d341fe5ff22e08aba. We were incorrectly unmounting the directory under normal conditions.
| Signed-off-by: Sander Vrijders <[email protected]>
| Signed-off-by: Dimitri Staessens <[email protected]>
* lib: Add Rendez-Vous mechanism for flow control (Dimitri Staessens, 2020-10-11, 3 files, -31/+130)
| This adds the rendez-vous mechanism to handle the case where the sending window is closed and window updates get lost. If the sending window is closed, the sender side will send an RDVS every DELT_RDV time (100ms), and give up after MAX_RDV time (1 second). Upon reception of a RDVS packet, a window update is sent immediately. We can make this much more configurable later on (build options for defaults, fccntl for runtime tuning).
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
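The timing rule described above can be sketched as a small decision function. DELT_RDV and MAX_RDV are the names used in the commit message with the values it states; the nanosecond unit, the `rdv_action` enum and the `rdv_check` function are assumptions made for this illustration and not the FRCT implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MILLION  1000000ULL
#define DELT_RDV (100 * MILLION)    /* ns between RDVS packets (100 ms) */
#define MAX_RDV  (1000 * MILLION)   /* ns before giving up (1 second)   */

enum rdv_action {
        RDV_WAIT,      /* nothing to do yet             */
        RDV_SEND,      /* time to send another RDVS     */
        RDV_GIVE_UP    /* window stayed closed too long */
};

/* Hypothetical check for a sender whose window closed at closed_at
 * and that last sent an RDVS at last_rdv (all times in ns). */
static enum rdv_action rdv_check(uint64_t now,
                                 uint64_t closed_at,
                                 uint64_t last_rdv)
{
        assert(now >= closed_at && now >= last_rdv);

        if (now - closed_at >= MAX_RDV)
                return RDV_GIVE_UP;

        if (now - last_rdv >= DELT_RDV)
                return RDV_SEND;

        return RDV_WAIT;
}
```

On the receiving side the rule is simpler: per the commit message, every RDVS that arrives triggers an immediate window update.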
* lib: Block on closed flow control window (Dimitri Staessens, 2020-10-11, 2 files, -18/+129)
| If the sending window for flow control is closed, the sending application will now block until the window opens. Beware that until the rendez-vous mechanism is implemented, shutting down a server while the client is sending (with non-timed-out blocking write) will cause the client to hang indefinitely because its window will close.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Refactor flow_write (Dimitri Staessens, 2020-10-11, 1 file, -20/+11)
| Refactor flow_write cleanup.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Send and receive window updates (Dimitri Staessens, 2020-10-11, 3 files, -9/+38)
| This adds sending and receiving window updates for flow control. I used the 8 pad bits as part of the window update field, so it's 24 bits, allowing for ~16 million packets in flight.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Add compiler configuration for FRCP (Dimitri Staessens, 2020-10-11, 5 files, -93/+153)
| This allows configuring some parameters for FRCP at compile time, such as default values for Delta-t and configuration of the timerwheel. The timerwheel will now reschedule when it fails to create a packet, instead of setting the flow down immediately. Some new things added are options to store packets for retransmission on the heap, and using non-blocking calls for retransmission. The defaults do not change the current behaviour.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix locking for FRCT (Dimitri Staessens, 2020-09-26, 1 file, -2/+4)
| Flows should be locked when moving the timerwheel. For frcti_snd, a rdlock is enough.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Complete retransmission logic (Dimitri Staessens, 2020-09-25, 5 files, -383/+667)
| This completes the retransmission (automatic repeat request, ARQ) logic, sending (delayed) ACK messages when needed. On deallocation, flows will ACK and try to retransmit any remaining unacknowledged messages (unless the FRCTFLINGER flag is turned off; this is on by default). Applications can safely shut down as soon as everything is ACK'd (i.e. the current Delta-t run is done). The activity timeout is now passed to the IPCP for it to sleep before completing deallocation (and releasing the flow_id). That should be moved to the IRMd in due time.
|
| The timerwheel is revised to be multi-level to reduce memory consumption. The resolution bumps by a factor of 1 << RXMQ_BUMP (16) and each level has RXMQ_SLOTS (1 << 8) slots. The lowest level has a resolution of (1 << RXMQ_RES) (20) ns, which is roughly a millisecond. Currently, 3 levels are defined, so the largest delay we can schedule at each level is:
|
|   Level 0: 256ms
|   Level 1: 4s
|   Level 2: about a minute.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
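The per-level figures can be checked with a short calculation. The macro names come from the commit message; the value RXMQ_BUMP = 4 (so that 1 << RXMQ_BUMP is the factor 16), the RXMQ_LVLS name and the `rxm_level_span_ns` helper are assumptions for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define RXMQ_RES   20         /* level-0 slot resolution: 1 << 20 ns    */
#define RXMQ_BUMP  4          /* assumed: 1 << 4 == 16 between levels   */
#define RXMQ_SLOTS (1 << 8)   /* slots per level                        */
#define RXMQ_LVLS  3          /* hypothetical name for the level count  */

/* Largest delay (in ns) that fits in one level of the wheel:
 * slot resolution at that level times the number of slots. */
static uint64_t rxm_level_span_ns(unsigned int lvl)
{
        uint64_t res;

        assert(lvl < RXMQ_LVLS);

        res = (uint64_t) 1 << (RXMQ_RES + lvl * RXMQ_BUMP);

        return res * RXMQ_SLOTS;
}
```

Under these assumptions the spans are 2^28 ns (about 268 ms), 2^32 ns (about 4.3 s) and 2^36 ns (about 69 s), which matches the rounded "256ms", "4s" and "about a minute" in the commit message, since 1 << 20 ns is slightly more than a millisecond.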
* lib: Allow pure acknowledgment packets in FRCT (Dimitri Staessens, 2020-06-06, 3 files, -150/+297)
| This adds the logic to send a pure acknowledgment packet without any data to send. This needed the event filter for the fqueue, as these non-data packets should not trigger application PKT events. The default timeout is now 10ms, until we have FRCP tuning as part of fccntl.
|
| Karn's algorithm seems to be very unstable with low (sub-ms) RTT estimates. Doubling RTO (every RTO) seems still too slow to prevent rtx storms when the measured RTT suddenly spikes several orders of magnitude. Just assuming the ACK'd packet is the last one transmitted seems to be a lot more stable. It can lead to temporary underestimation, but this is not a throughput-killer in FRCP.
|
| Changes most time units to nanoseconds for faster computation.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Check rdrbuff sanitize for robust mutexes (Dimitri Staessens, 2020-05-29, 1 file, -0/+2)
| The sanitize function in the rdrbuff should only be compiled if robust mutexes are present on the system.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Refactor FRCT (Dimitri Staessens, 2020-05-04, 2 files, -72/+59)
| This is a small refactor of FRCT because I found some things a bit hard to read. I tried to refactor frcti_rcv to always queue the packet, but that causes the lock to be retaken unnecessarily when calling queued_pdu, and thus returning idx is a tiny bit faster.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix retransmission DRF update (Dimitri Staessens, 2020-05-02, 1 file, -4/+0)
| The retransmission was always disabling the DRF flag. This caused problems with the loss of the first packet, which of course needs a DRF flag set. The retransmitted packet will now contain the original DRF flag and an updated ack number.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Create an rxmwheel per flow (Dimitri Staessens, 2020-05-02, 4 files, -128/+149)
| The single retransmission wheel caused locking headaches as the calls for different flows could block on the same rxmwheel. This stabilizes the stack, but if the rdrbuff gets full there can now be big delays.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix updating retransmission wheel (Dimitri Staessens, 2020-05-01, 4 files, -23/+28)
| Fixes infinite rescheduling with RTO getting lower than the timerwheel resolution. For very low RTO values we'd need a big packet buffer with the current memory allocator implementation (rdrbuff). Setting a (configurable) minimum RTO (250 us) reduces this need.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Unmount stale RIB directories (Sander Vrijders, 2020-04-30, 1 file, -1/+10)
| If Ouroboros crashed, the RIB directory might still be mounted. This checks if this is the case, then unmounts it.
| Signed-off-by: Sander Vrijders <[email protected]>
| Signed-off-by: Dimitri Staessens <[email protected]>
* lib: Stabilize FRCP under packet loss conditions [0.17.3] (Dimitri Staessens, 2020-04-30, 2 files, -59/+68)
| There were a bunch of bugs in FRCP that urgently needed fixing. Now data QoS is usable even with heavy packet loss (within some parameters). The current RTT estimator is the IETF one. It should be updated to the improved one used in the Linux kernel once the A-timer (ACKs without data) and graceful shutdown are implemented.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
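For readers unfamiliar with "the IETF one", the estimator standardized in RFC 6298 can be sketched in integer arithmetic as below. This is an illustration under assumptions, not the FRCP code: the `struct rtt_est` and `rtt_update` names are hypothetical, the clock-granularity term of the RFC is left out, and the 250 us floor is the minimum RTO mentioned elsewhere in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_RTO 250000ULL   /* ns; the 250 us minimum from this log */

struct rtt_est {
        uint64_t srtt;      /* smoothed RTT (ns)  */
        uint64_t rttvar;    /* RTT variation (ns) */
        bool     init;      /* first sample seen? */
};

/* Feed one measured RTT (ns) and return the new RTO (ns), following
 * RFC 6298 with alpha = 1/8 and beta = 1/4 done as shifts. */
static uint64_t rtt_update(struct rtt_est * e, uint64_t mrtt)
{
        uint64_t diff;
        uint64_t rto;

        assert(e != NULL);

        if (!e->init) {
                e->srtt   = mrtt;
                e->rttvar = mrtt >> 1;
                e->init   = true;
        } else {
                diff = e->srtt > mrtt ? e->srtt - mrtt : mrtt - e->srtt;
                /* rttvar is updated first, using the old srtt */
                e->rttvar = e->rttvar - (e->rttvar >> 2) + (diff >> 2);
                e->srtt   = e->srtt - (e->srtt >> 3) + (mrtt >> 3);
        }

        rto = e->srtt + (e->rttvar << 2);

        return rto < MIN_RTO ? MIN_RTO : rto;
}
```

With a steady 1 ms RTT the RTO starts at 3 ms and decays toward the smoothed RTT as the variation term shrinks; the floor keeps very small RTT estimates from producing an RTO below the timerwheel resolution.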
* irmd: Fix cleanup of shm_flow_set [0.17.1] (Dimitri Staessens, 2020-03-20, 1 file, -1/+1)
| The shm_flowset destroy was using the irmd pid, resulting in wrong unlinks. The irmd was not cleaning up the process table, resulting in shm leaks if there were still running processes on exit.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix thread management in thread pool manager (Dimitri Staessens, 2020-03-20, 1 file, -1/+1)
| The thread pool manager wasn't counting working threads when deciding to create new ones, resulting in constant starting of new threads when threads were busy.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Return number of written bytes on flow_write [0.17.0] (Dimitri Staessens, 2020-03-15, 1 file, -3/+2)
| This is more in line with the write() system call and prepares for partial writes. Partial writes are disabled by default (and not yet implemented).
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Change return type of fevent to ssize_t (Dimitri Staessens, 2020-03-15, 1 file, -3/+3)
| The return type was still an int, but since it returns the number of events, it should be an ssize_t.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* irm: Revise naming API (Dimitri Staessens, 2020-03-15, 2 files, -21/+146)
| This revises the naming API to treat names (or reg_name in the source) as first-class citizens of the architecture. This is more in line with the way they are described in the article.
|
| Operations have been added to create/destroy names independently of registering. This was previously done only as part of register, and there was no way to delete a name from the IRMd. The create call now allows specifying a policy for load-balancing incoming flows for a name. The default is the new round-robin load-balancer; the previous behaviour is still available as a spillover load-balancer.
|
| The register calls will still create a name if it doesn't exist, with the default round-robin load-balancer.
|
| The tools now have a "name" section, so the format is now
|
|   irm name <operation> <name> ...
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix deadlock in threadpool manager (Dimitri Staessens, 2020-03-14, 1 file, -9/+23)
| There was a rare deadlock upon destruction of the threadpool manager because the threads were cancelled/joined under lock.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib, ipcpd: piggyback ECDHE on flow allocation (Dimitri Staessens, 2020-02-25, 5 files, -234/+233)
| The initial implementation for the ECDHE key exchange was doing the key exchange after a flow was established. The public keys are now sent along on the flow allocation messages, so that an encrypted tunnel can be created within 1 RTT. The flow allocation steps had to be extended to pass the opaque data ('piggybacking').
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Fix shm_rbuff test (Dimitri Staessens, 2020-02-16, 1 file, -0/+8)
| The rbuff_destroy function asserts that we do not try to destroy an rbuff that still contains packets. The test now empties the rbuff before destroying it.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* ipcpd: Configure PFF from routing policy (Dimitri Staessens, 2020-02-16, 2 files, -9/+6)
| The Packet Forwarding Function (PFF) was user-configurable using the irm tool. However, this isn't really wanted since the PFF is dictated by the routing algorithm. This moves the responsibility for selecting the correct PFF from the network admin to the unicast IPCP implementation. Each routing policy now has to specify which PFF it will use.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* lib: Move hashtable from lib to unicast (Dimitri Staessens, 2020-02-16, 4 files, -352/+0)
| The hashtable is only used for forwarding tables in the unicast IPCP. This moves the generic hashtable out of the library into the unicast IPCP to prepare a more tailored implementation specific to routing tables containing address lists.
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>
* build: Update copyright to 2020 [0.16.0] (Dimitri Staessens, 2020-01-02, 42 files, -42/+42)
| Signed-off-by: Dimitri Staessens <[email protected]>
| Signed-off-by: Sander Vrijders <[email protected]>