| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
2022 was a rather slow year...
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
Found by Clang version 15.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The dealloc call will now always do a non-blocking read before
attempting to destroy the rbuff, ensuring all keepalives are
processed.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
There was an unused struct timerwheel * lingering in the application
instance.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
Growing pains.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If a flow was deallocated while there were still unprocessed events in
an fqueue, it would cause a SEGV in fqueue_next because it was not
checking the validity of the returned flow descriptor.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
The fqueues were relying on the fact that the portevent were two
integers. This cleans that up a bit.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
The protobuf message was free'd before usage in flow_init.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The application will now handle incoming FRCT packets even if the
application never reads data from the flow (for instance servers). To
do this, it reserves an fset_t (id 0). When an FRCT-enabled flow is
created, it is automatically added to this fset. An rx thread will
listen for incoming events and perform necessary actions on the flow
if needed. If the FRCT flow is added to another user fset, it will be
handled by that user fset (and if the flow is removed from a user
fset, it will be re-added to the set with id 0 to be handled by the
rx_flow thread. The flow monitoring is handled by the same thread,
replacing the previous monitoring thread.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Now the instance keeps all flows for an application in a linked list
to easily iterate over all allocated flows, which is needed by the
keepalive monitoring. This is more efficient that tracking min and max
fd.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
We don't need to iterate fsets anymore since the removal of fset_keepalive.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The frcti_filter was reading raw data from the buffers, causing the
frcti_rcv to operate directly on encrypted packets. It decrypt and
filter for invalid packets. I moved the function from frct to the
fqueue implementation and renamed it fqueue_filter as it filters
fqueues. Should be extended to filter out keepalives on non-FRCT
flows, as these will now still cause spurious wakeups.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds a monitoring thread to handle flow keepalive management in
the application and removes the thread interruptions to schedule FRCT
calls within the regular IPC calls.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reading/writing to (N + 1)-flows from the IPCP was using a raw QoS flow
to bypass some functions in the ipcp_flow_read call. But this call was
broken for keepalive packets. Fixing the ipcp_flow_read call for
(N - 1) flows causes the IPCPs to drop 0-byte keepalive packets coming from
(N + 1) client flows.
>From now on, there is a dedicated call for (N + 1) reads/writes from
the IPCPs that's more efficient and cleaner. The (N + 1) flow internal
QoS is now also defaulted to a qos_np1 qosspec, instead of tampering
with the qosspec requested by the (N + 1) client.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first step moving away from scheduling the FRCT and flow
monitoring functions as part of the IPC calls (flow_read / flow_write
/ fevent) and towards the more scalable (and far less complicated)
implementation to take care of these functions in separate threads.
If a process creates the first flow that requires FRCT, it will spin
up a thread to process events on the timerwheel (retransmissions and
delayed ACKs). This single thread lives until the last flow with FRCT
is deallocated.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
The creation of FRCT instances (if needed) is now part of flow_init()
call instead of an addition after the flow is initialized.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
Writing valid packets to the rbuff (add crc check, encrypt) is now
extracted into a function.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
Prog name is not used anymore, probably a remnant from the early days,
when we were passing rina_name_t tuples all over the place.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reading packets from the rbuff and checking their validity (non-zero
size, pass crc check, pass decryption) is now extracted into a
function.
Also adds a function to get the length of an sdu_du_buff instead of
subtracting the tail and head pointers.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
The fset add function was notifying for each packet already stored in
the rx rbuff, which isn't needed.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The maximum packet lifetime (MPL) is a property of the flow that needs
to be passed to the reliable transmission protocol (FRCP) for its
correct operation. Previously, the value of MPL was set fixed as one
of the (fixed) Delta-t parameters. This patch makes the MPL a property
of the layer, and it can now be set per layer-type at build time.
This is a step towards a proper MPL estimator in the flow allocator.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The parameters were set directly from the build configs. A first step
to making FRCP configurable at runtime, is to pass the parameters to
the frcti_create() function.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
If the keepalive would underflow if set to 1-3 ms.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
The rib_init return value wasn't checked.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
Bare FRCP messages (ACKs without data, Rendez-vous packets) were not
encrypted on encrypted flows, causing the receiver to fail decryption.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The qosspec_t now has a timeout value that sets the timeout value of
the flow. Flows with a peer that has timed out will now return
-EFLOWPEER on flow_read() or flow_write().
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds flow liveness monitoring for flows, with a fixed timeout of
120s. I will make it configurable at flow allocation later on (timeout
needs to be communicated to the peer). If one peer dies, or doesn't
call any IPC calls (flow_write/flow_read/fevent) it will stop sending
keepalives and the other peer's read/writes will error on an
-EFLOWDOWN after the timeout expires.
Packets without a payload (0 length packets) are interpreted as
keepalive packets for the flow. They can be sent from any application,
but they will not trigger a message read at the receiver side (0 as a
return value on flow_read indicates a previous partial read has
completed at exactly the buffer size).
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The flow_set will now keep a list of the flows in the set, this makes
it more efficient to iterate over the flows. Extending the public API
for fset_t with an iterator will also be useful.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fccntl call FRCTSFLAGS was using a pointer to a flags so set
flags, which should just be a regular uint16_t.
For instance, the FRCTLINGER flags can now be turned off using
fccntl(fd, FRCTSFLAGS, FRCTFRESCNTL | FRCTFRTX)
leaving only resource control (flow control, FRCTFRESCNTL) and
retransmission enabled. Note that retransmission (FRCTFRTX) can't be
enabled or disabled on a live flow, it will be set on flow allocation.
Updates the man page for fccntl to add these FRCT options.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is a fix to wait for outstanding retransmissions when a flow is
deallocated. Instead of waiting the full timeout, it will now wait in
the same tic increments used within FRCT. Bit of a stopgap at the
moment, FRCT and the flows are in need of a serious refactor.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
There was some leftover code in dev.c wrt to the process RIB that is
not needed anymore.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will skip rib_init() at __init() for IPCPs (or at least,
processes that have "ipcpd" in the executable name). The previous code
tried to unmount the generic mount and then remount under the ipcp
name, but it often failed because fuse_mount() is asynchronous and the
mount was not up at the time of the unmount() call. Renaming the mount
instead of unmounting failed for the same reason. This is a better
fix for now.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Application flows can now be monitored from the RIB, exposing FRCT
statistics (window edges, retransmission timeout, rtt estimate, etc).
Application RIB requires user permissions to be able to access
/dev/fuse.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This add an ouroboros/pthread.h header that wraps the
pthread_..._unlock() functions for cleanup using
pthread_cleanup_push() as this casting is not safe (and there were
definitely bad casts in the code). The close() function is now also
wrapped for cleanup in ouroboros/sockets.h.
This allows enabling more compiler checks.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
The ugent email addresses are shut down, updated to Ouroboros mail
addresses.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
Happy New Year, Ouroboros!
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
DH key creation was returning -ECRYPT if opennssl is not installed,
instead of success (0).
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds congestion avoidance policies to the unicast IPCP. The
default policy is a multi-bit explicit congestion avoidance algorithm
based on data-center TCP congestion avoidance (DCTCP) to relay
information about the maximum queue depth that packets experienced to
the receiver. There's also a "nop" policy to disable congestion
avoidance for testing and benchmarking purposes.
The (initial) API for congestion avoidance policies is:
void * (* ctx_create)(void);
void (* ctx_destroy)(void * ctx);
These calls create / and or destroy a context for congestion control
for a specific flow. Thread-safety of the context is the
responsability of the flow allocator (operations on the ctx should be
performed under a lock).
ca_wnd_t (* ctx_update_snd)(void * ctx,
size_t len);
This is the sender call to update the context, and should be called
for every packet that is sent on the flow. The len parameter in this
API is the packet length, which allows calculating the bandwidth. It
returns an opaque union type that is used for the call to check/wait
if the congestion window is open or closed (and allowing to release
locks before waiting).
bool (* ctx_update_rcv)(void * ctx,
size_t len,
uint8_t ecn,
uint16_t * ece);
This is the call to update the flow congestion context on the receiver
side. It should be called for every received packet. It gets the ecn
value from the packet and its length, and returns the ECE (explicit
congestion experienced) value to be sent to the sender in case of
congestion. The boolean returned signals whether or not a congestion
update needs to be sent.
void (* ctx_update_ece)(void * ctx,
uint16_t ece);
This is the call for the sending side top update the context when it
receives an ECE update from the receiver.
void (* wnd_wait)(ca_wnd_t wnd);
This is a (blocking) call that waits for the congestion window to
clear. It should be stateless (to avoid waiting under locks). This may
change later on if passing the context is needed for different algorithms.
uint8_t (* calc_ecn)(int fd,
size_t len);
This is the call that intermediate IPCPs(routers) should use to update
the ECN field on passing packets.
The multi-bit ECN policy bases the value for the ECN field on the
depth of the rbuff queue packets will be sent on. I created another
call to grab the queue depth as fccntl is write-locking the
application. We can further optimize this to avoid most locking on the
rbuff.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the rendez-vous mechanism to handle the case where the
sending window is closed and window updates get lost. If the sending
window is closed, the sender side will send an RDVS every DELT_RDV
time (100ms), and give up after MAX_RDV time (1 second). Upon
reception of a RDVS packet, a window update is sent immediately. We
can make this much more configurable later on (build options for
defaults, fccntl for runtime tuning).
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the sending window for flow control is closed, the sending
application will now block until the window opens. Beware that until
the rendez-vous mechanism is implemented, shutting down a server while
the client is sending (with non-timed-out blocking write) will cause
the client to hang indefinitely because its window will close.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
| |
Refactor flow_write cleanup.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows configuring some parameters for FRCP at compile time, such
as default values for Delta-t and configuration of the timerwheel. The
timerwheel will now reschedule when it fails to create a packet,
instead of setting the flow down immediately. Some new things added
are options to store packets for retransmission on the heap, and using
non-blocking calls for retransmission. The defaults do not change the
current behaviour.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
Flows should be locked when moving the timerwheel. For frcti_snd, a
rdlock is enough.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This completes the retransmission (automated repeat-request, ARQ)
logic, sending (delayed) ACK messages when needed.
On deallocation, flows will ACK try to retransmit any remaining
unacknowledged messages (unless the FRCTFLINGER flag is turned off;
this is on by default). Applications can safely shut down as soon as
everything is ACK'd (i.e. the current Delta-t run is done). The
activity timeout is now passed to the IPCP for it to sleep before
completing deallocation (and releasing the flow_id). That should be
moved to the IRMd in due time.
The timerwheel is revised to be multi-level to reduce memory
consumption. The resolution bumps by a factor of 1 << RXMQ_BUMP (16)
and each level has RXMQ_SLOTS (1 << 8) slots. The lowest level has a
resolution of (1 << RXMQ_RES) (20) ns, which is roughly a
millisecond. Currently, 3 levels are defined, so the largest delay we
can schedule at each level is:
Level 0: 256ms
Level 1: 4s
Level 2: about a minute.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the logic to send a pure acknowledgment packet without any
data to send. This needed the event filter for the fqueue, as these
non-data packets should not trigger application PKT events. The
default timeout is now 10ms, until we have FRCP tuning as part of
fccntl.
Karn's algorithm seems to be very unstable with low (sub-ms) RTT
estimates. Doubling RTO (every RTO) seems still too slow to prevent
rtx storms when the measured rtt suddenly spikes several orders of
magnitude. Just assuming the ACK'd packet is the last one transmitted
seems to be a lot more stable. It can lead to temporary
underestimation, but this is not a throughput-killer in FRCP.
Changes most time units to nanoseconds for faster computation.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is a small refactor of FRCT because I found some things a bit
hard to read. I tried to refactor frcti_rcv to always queue the
packet, but that causes unnecessarily retaking the lock when calling
queued_pdu and thus returning idx is a tiny bit faster.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The single retransmission wheel caused locking headaches as the calls
for different flows could block on the same rxmwheel. This stabilizes
the stack, but if the rdrbuff gets full there can now be big delays.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is more in line with the write() system call and prepares for
partial writes. Partial writes are disabled by default (and not yet
implemented).
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
| |
The return type was still an int, but since it returns the number of
events, it should be an ssize_t.
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The initial implementation for the ECDHE key exchange was doing the
key exchange after a flow was established. The public keys are now
sent allowg on the flow allocation messages, so that an encrypted
tunnel can be created within 1 RTT. The flow allocation steps had to
be extended to pass the opaque data ('piggybacking').
Signed-off-by: Dimitri Staessens <[email protected]>
Signed-off-by: Sander Vrijders <[email protected]>
|