---
date: 2021-03-20
title: "How does Ouroboros relate to RINA, the Recursive InterNetwork Architecture?"
linkTitle: "Is Ouroboros RINA?"
description: "A brief history of Ouroboros"
author: Dimitri Staessens
---
```
There are two kinds of researchers: those that have implemented
something and those that have not. The latter will tell you that there
are 142 ways of doing things and that there isn't consensus on which
is best. The former will simply tell you that 141 of them don't work.
-- David Cheriton
```
When I talk to someone that's interested in Ouroboros, a question that
frequently pops up is how the project relates to the
[Recursive InterNet(work) Architecture](https://en.wikipedia.org/wiki/Recursive_Internetwork_Architecture),
or **RINA**. I usually steer away from going into the technical
aspects of how the architectures differ, mostly because not many
people know the details of how RINA works. But the origin of Ouroboros
definitely lies with our research and our experiences implementing
RINA, so it's a good question. I'll address it as best as I can,
without going overboard on a technical level. I will assume the reader
is at least somewhat familiar with RINA. Also keep in mind that both
projects are ongoing and should not be considered as "done"; things
may change in the future. These are my -- inevitably subjective and
undoubtedly somewhat inaccurate -- recollections of how it went down,
why Ouroboros exists, and how it's different from RINA.
If you're in a hurry, this is the TL;DR: We spent 4-5 years
researching RINA in EU-funded projects and understand its concepts and
ideas very well. However, we looked beyond the premises and the us-vs-them mentality of the RINA community and found areas for improvement and further simplification -- and more than a couple of things in RINA that are just plain-old wrong. While RINA insiders may suggest that Ouroboros is 'RINA-inspired' or use some other phrasing that insinuates our prototype is an inferior design or some watered-down version of RINA: it is not. To the contrary, Ouroboros is the diamond to RINA's graphite.
And a quick note here: Ouroboros _the network prototype_ has no
relation to Ouroboros _the Proof-of-Stake protocol_ in the Cardano
blockchain. That some of the Cardano guys are also interested in RINA
doesn't help to ease any confusion.
### IBBT meets RINA
I first came into contact with RINA somewhere in 2012, while working
as a senior researcher in the field of telecommunication networks at
what was then known as IBBT (I'll save you the abbreviation). IBBT
would soon be known as iMinds, and is now integrated into
[IMEC](https://www.imec-int.com). A new research project was going to
start and our research group was looking for someone to be responsible
for the IBBT contributions. That project, called
[IRATI](https://cordis.europa.eu/project/id/317814) was a relatively
short (2-year) project in the "Future Internet Research and
Experimentation" (FIRE) area of the _7th framework programme_ of the
European Commission. I won't go into the details and strategies of
research funding; the important thing to know is that the objectives
of FIRE are "hands-on", aimed at building and deploying Internet
technologies. Given that I had some experience deploying experiments
(at that time OpenFlow prototypes) on our lab testbeds, I listened to
the project pitch, an online presentation with Q&A given by the
project lead, Eduard Grasa from [i2cat](https://i2cat.net/), who
explained the concepts behind RINA, and got quite excited about how
elegant this all looked. So I took on the project and read John Day's
[Patterns in Network Architecture](https://www.oreilly.com/library/view/patterns-in-network/9780132252423/),
which we later usually referred to as _PNA_. It was also the time
when I was finishing my PhD thesis, so my postdoc track would, for a substantial part, be on computer network architecture and RINA.
Unifying
[Inter-Process Communication](https://en.wikipedia.org/wiki/Inter-process_communication)
(IPC) and networking. How exciting was that!
IRATI -- Investigating RINA as an Alternative to TCP/IP -- was
something different from the usual research projects, involving not
only some substantially new and unfamiliar ideas, but also relying very heavily on software development. Project work was performed as
part of PhD tracks, so who would do the work? There was a PhD student
under my guidance working mostly on OpenFlow, Sachin -- one of the
kindest people I have ever met, and now a professor at TU Dublin --
and we had a student with us, Sander Vrijders, who had just finished his
master's thesis. We invited him to talk about a possible PhD track,
aligned to ongoing and upcoming projects in our group. Sander decided
to take on the challenge of IRATI and start a PhD track on RINA.
### IRATI
**IRATI** kicked off in January 2013 at i2cat in Barcelona. It was
followed by a RINA workshop, bringing the project in touch with the
RINA community, which had its epicenter at Boston University
(BU). It's where I first met John Day, who gave a 2-day in-depth
tutorial on RINA. Eduard also presented an outline of the IRATI
objectives. The project promised an implementation of RINA in Linux
_and_ FreeBSD/JunOS, with detailed comparisons of RINA against TCP/IP
in various scenarios, and would also demonstrate interoperability with other
RINA prototypes: the
[TINOS prototype](https://github.com/PouzinSociety/tinos) and the
[TRIA](http://trianetworksystems.com/) prototype. IRATI would also
prepare the European FIRE testbeds for RINA experiments using the
prototype. In 2 years, on 870k Euros in research funding. A common
inside joke at project kick-off meetings in our field was to put a
wager on the number of slides by which the presentation deck at the final project review meeting would differ from the slide deck presented at
the initial kick-off meeting. IRATI was _not_ going to be one of those
projects!
With the RINA community gathered at the workshop, there were initial
ideas for a follow-up research proposal to IRATI. Of course, almost
every potential participant present was on board.
Three partners were responsible for the implementation: i2cat, who had
experience on RINA; [Nextworks](https://www.nextworks.it), a private-sector company with substantial experience in implementing
networking solutions, and iMinds/imec, bringing in our testbed
experience. Interoute (now part of [GTT](https://gtt.net)) validated
the test scenarios that we would use for evaluations. Boston University
had an advisory role in the project.
The first work was determining the software design of the
implementation. IRATI was going to build an in-kernel implementation
of RINA. A lot of the heavy lifting on the design was already done
during the project proposal preparation phase, and about 3 months into
the project, the components to be implemented were
[well-defined](https://core.ac.uk/download/pdf/190646748.pdf).
Broadly speaking, there were 3 things to implement: the IPCPs that
make up the RINA layers (Distributed IPC Facilities, DIFs), the
component that is responsible for creating and starting these IPCPs
(the IPC manager), and the core library to communicate between these
components, called _librina_. The prototype would be built in 3 phases
over the course of 2 years.
i2cat was going to get started on most of the management parts (IPC
Manager, based on their existing Java implementation; librina,
including the Common Distributed Application Protocol (CDAP) and the
DIF management functions in the normal IPCP) and the Data Transfer
Protocol (DTP). iMinds was going to be responsible for the kernel
modules that would allow the prototype to run on top of
Ethernet. Nextworks was taking a crucial software-architectural role
in kernel development and software integration. For most of these parts we had access to a rough draft of what they were supposed to do:
John Day's RINA reference model, which we usually referred to as _the
specs_.
i2cat had a vested interest in RINA and was putting in a lot of
development effort with 3 people working on the project: Eduard,
Leonardo Bergesio and Miquel Tarzán. Nextworks assigned
Francesco Salvestrini, an experienced kernel developer, to the
project. From iMinds, the development effort would come from
Sander. My personal involvement in the project software development
was limited, as I still had other ongoing projects (at least until the
end of 2014) and my main role would be in the experimentation work,
which was only planned to start after the initial development phase.
The project established efficient lines of communication, mostly using Skype and the mailing lists, and the implementation work got
underway swiftly. I have been fortunate to be a part of a couple of
projects where collaboration between partners was truly excellent, but
the level of teamwork in IRATI was unprecedented. There was a genuine
sense of excitement among everybody involved in the project.
So, Sander's first task was to implement the
[_shim DIF over Ethernet_](https://ieeexplore.ieee.org/document/6798429).
This is a Linux loadable kernel module (LKM) that wraps the Ethernet
802.1Q VLAN with a thin software layer to present itself using the
RINA API. The VLAN ID would be used as the layer name. No
functionality would be added to the existing Ethernet protocol, so with
only the src and dst address fields left, this _shim DIF_ was
restricted to having only a single application registered at a time,
and to a single RINA "flow" between the endpoints. We could deploy
about 4000 of these _shim DIFs_ in parallel to support larger RINA
networks. The name resolution for endpoint applications was planned to
use the Address Resolution Protocol (ARP), which was readily
available in the Linux kernel.
Or so we thought. The ARP implementation in the kernel assumed IPv4 as
the only L3 protocol (IPv6 doesn't use ARP), so it could not handle
the resolution of RINA _application names_ to MAC addresses, which we
needed for the shim DIF. So after some deliberation, we decided to
implement an RFC 826 compliant version of ARP to support the shim DIF.
In the meantime, we also submitted a small 3-partner project proposal
within the GEANT framework, tailored to researching RINA in an NREN (National Research and Education Network) environment. The project was led by
us, partnering with i2cat, and teaming up with
[TSSG](https://tssg.org/). [IRINA](https://i2cat.net/projects/irina/)
would kick off in October 2013, meaning we'd have 2 parallel projects
on RINA.
The project had made quite some progress in its first 6 months: there
were initial implementations for most of the components, and in terms
of core prototype functionality, IRATI was quickly overtaking the
existing RINA prototypes. However, the pace of development in the
kernel was slower than anticipated and some of the implementation
objectives were readjusted (and FreeBSD/JunOS was dropped in favor of
a _shim DIF for Hypervisors_). With an eye on testbed deployments,
Sander started work on the design of a second _shim DIF_, one that
would allow us to run the IRATI prototype over TCP/UDP.
In the meantime, the follow-up project that was coined during the
first RINA workshop took shape and was submitted. Led by our IRINA partner TSSG, it was envisioned to be a relatively large project,
about 3.3 million Euros in EC contributions, running for 30 months and
bringing together 13 partners with the objective to build the IRATI
prototype into what was essentially a carrier network demonstrator for
RINA, adding _policies_ for mobility, security and reliability.
[**PRISTINE**](https://cordis.europa.eu/project/id/619305) got
funded. This was an enormous boon to the RINA community, but also a
bit of a shock for us as IRATI developers, as the software was already
a bit behind schedule with a third project on the horizon. The
furthest we could push forward the start of PRISTINE was January 2014.
As the IRATI project was framed within
[FIRE](https://dl.acm.org/doi/10.1145/1273445.1273460), there was a
strong implied commitment to get experimental results with the project
prototype. By the last quarter of 2013, the experimentation work got
started, and the prototype was getting its first deployment trials on
the FIRE testbeds. This move to real hardware brought more problems to
light. The network switches in the OFELIA testbed weren't agreeing very well with our RFC-compliant ARP implementation, dropping everything that didn't have IPv4 as the network address. One of the testbeds also relied on VLANs to separate experiments, which didn't sit well with our idea of (ab)using them within an experiment for the _shim
DIF_. While Sander did the development of the _shim DIFs_ using the
actual testbed hardware, other components had been developed
predominantly in a virtual machine environment and had not been
subjected to the massive parallelism that was available on dual-Xeon
hardware. The stability of the implementation had to be substantially
improved to get stable and reliable measurements. These initial trials
in deploying IRATI also showed that configuring the prototype was very
time-consuming. The components used JSON configuration files that had to be created for each experiment deployment, causing substantial
overhead.
The clock was ticking and while the IRATI development team was working
tirelessly to stabilize the stack, I worked on some (kernel) patches
and fixes for the testbeds so we could use VLANs (on a different
Ethertype) in our experiment. We would get deployment and stability
testing done and (internally) release _prototype 1_ before the end of
the year.
### PRISTINE
January 2014. The PRISTINE kick-off was organized together with a
workshop, where John Day presented RINA, similar to the IRATI kick-off
one year earlier, except this time it was in Dublin and the project
was substantially bigger, especially in headcount. It brought together
experts in various fields of networking with the intent of them
applying that experience to developing policies for RINA. But many of the participants in the PRISTINE project were very new to RINA, still
getting to grips with some of the concepts (and John didn't shy away
from making that abundantly clear).
The first couple of months of PRISTINE were mostly about getting the
participants up-to-speed with the RINA architecture and defining the
use-case, which centered on a 5G scenario with highly mobile end-users
and intelligent edge nodes. It was very elaborate, and the associated
deliverables were absolute dreadnoughts.
During this PRISTINE ramp-up phase, development of the IRATI prototype
was going on at a fierce pace. The second project brought in some
extra developers to work on the IRATI core: Bernat Gaston (i2cat), Vincenzo Maffione (Nextworks), and Douwe de Bock (a master's student at iMinds). i2cat was focusing on management and flow control and was also porting the Java user-space parts to C++, Vincenzo was focusing on the _shim Hypervisor_, which would allow communications between processes running on a VM host and guest, and we were building the shim layer
to run RINA over TCP and UDP.
By this time, frustrations were starting to creep in. Despite all the
effort in development, the prototype was not in good shape. The
development effort was also highly skewed, with i2cat putting in the
bulk of the work. The research dynamic was also changing. At the start
of IRATI, there were a lot of ongoing architectural discussions about
what each component should do, to improve the _specs_, but due to the
ever increasing time pressure, the teams were working more and more in
isolation. Getting it _done_ became a lot more important than getting
it _right_.
All this development had led to very little dissemination output,
which didn't go unnoticed at project reviews. The upshot of the large
time-overlap between the two projects was that, in combination with
the IRATI design paper that got published early-on in the project, we
could afford to lose out a bit on dissemination in IRATI and try to
catch up in PRISTINE. But apart from the relatively low output in
research papers, this project had no real contributions to
standardization bodies.
In any case, the project had no choice but to push on with
development, and, despite all difficulties, somewhere mid-2014 IRATI
had most basic functionalities in place to bring the software in a
limited way into PRISTINE so it could start development of the
_PRISTINE software development kit (SDK)_ (which was developed by
people also in IRATI).
Mostly to please the reviewers, we tried to get some standardization
going, presenting RINA at an ISO SC6 JTC1 meeting in London and also
at IETF91. Miquel and myself would continue to follow up on
standardization in SC6 WG7 on "Future Network" as part of PRISTINE,
gathering feedback on the _specs_ and getting them on the track
towards ISO RINA standards. I still have many fond memories of my
experiences discussing RINA within WG7.
The IRATI project was officially ending soon, and the development was
now focusing on the last functions of the Data Transfer Control
Protocol (DTCP) component of EFCP, such as retransmission logic
(delta-t). Other development was now shifted completely out of IRATI
towards the PRISTINE SDK.
In the meantime, we also needed some experimental
results. Experimentation with the prototype was a painful and very
time-consuming undertaking. We finally squeezed a publication at
Globecom 2014 out of some test results and could combine that with a
RINA tutorial session.
January 2015, another new year, another RINA workshop. This time in
Ghent, as part of a Flemish research project called RINAiSense --
which should be pronounced like the French _renaissance_ -- that would
investigate RINA in sensor networks (which now falls under the moniker "Internet of Things" (IoT)). After the yearly _John Day presents RINA_
sessions, this was also the time to properly introduce the IRATI
prototype to everyone with a hands-on VM tutorial session, and to
introduce [RINAsim](https://rinasim.omnetpp.org/), an OMNET++ RINA
simulator developed within PRISTINE.
After the workshop, it was time to wrap up IRATI. For an external
observer it may have lacked impact and shown little output in publications, and it definitely didn't deliver a convincing case for _RINA as an alternative for TCP/IP_. But despite that, I think the project really achieved a lot, in terms of bringing, for the first time, some tools that can be used to explore RINA, and giving the people that worked on it an incredible experience and deep insights into computer networks in general. This would not have been possible without the enthusiasm
and hard work put in by all those involved, but especially Eduard and
the i2cat team.
As IRINA was wrapping up, a paper on how the _shim DIF over
Hypervisors_ could be used to [reduce complexity of VM
networking](https://ieeexplore.ieee.org/document/7452280) was
submitted to IEEE COMMAG.
We're approaching the spring of 2015, and IRATI was now officially
over, but there was no time to rest as the clock was ticking on
PRISTINE. The project was now already halfway through its anticipated 30-month runtime, and its first review, somewhere at the end of 2014, wasn't met with
all cheers, so we had to step up. This was also the period where some
of my other (non-RINA) projects were running out. Up to then, my
personal involvement in RINA had been on the (software) design of our components, reviewing the _specs_, and the practical hands-on work was in
using the software: deploying it on the testbeds and validating its
functionality. But now I could finally free up time to help Sander on
the development of the IRATI prototype.
Our main objective for PRISTINE was on _resilient routing_: making
sure the _DIF_ survives underlying link failures. This has been a
long-time research topic in our group, so we knew pretty quickly
_how_ to do it at a conceptual level. But there were three
requirements: first and foremost, it needed _scale_: we needed to be
able to run something that could be called a network, not just 3 or 4
nodes and not just a _couple_ of flows in the network. Second, it
needed _stability_: to measure the recovery time, we needed to send
packets at small but -- more importantly -- steady intervals. And thirdly, we needed measurement _tools_.
As part of IRINA, we developed a basic traffic-generator, which would
be extended for PRISTINE and tailored to suit our needs. Stability was
improving gradually over time. Our real problem was _scale_, for which
the biggest hurdle was the configuration of the IRATI stack. It was a
complete nightmare. Almost anything and everything had to be
preconfigured in _json_. I remember that by that time, Vincenzo had
developed a tool called the _demonstrator_ based on tiny buildroot VMs
to create setups for local testing, but this wasn't going to help us
deploy it on the Fed4FIRE testbeds. So Sander developed one of the
first orchestrators for RINA, called the _configurator_ for deploying
IRATI on [emulab](https://www.emulab.net/portal/frontpage.php).
Somewhere around that time, the _one-flow-only-limitation_ of the
_shim DIF over VLAN_ was showing and a _shim DIF over Ethernet Link
Layer Control (LLC)_ was drafted and developed. By mapping endpoints
to LLC Service Access Points (SAPs), this _shim DIF_ could support
parallel flows (data flows and management flows) between the client
IPCPs in the layer above.
With the PRISTINE SDK released as part of "openIRATI" a good month after the January workshop, there was another influx of
code into the prototype for all the new features
(a.k.a. _policies_). Francesco, who had been managing a lot of the
software integration, was also leaving the RINA projects. This is the
point where I really noticed that Sander and Vincenzo were quickly
losing faith in the future of the IRATI codebase, and the first ideas
of branching off -- or even starting over -- began to emerge.
The next Horizon 2020 proposal deadline was also approaching, so our struggles at that point inspired us to propose developing a more elaborate RINA orchestrator and to make deployment and experimentation
with (open)IRATI a much more enjoyable experience. That project,
[ARCFIRE](https://ict-arcfire.eu/) would start in 2016.
Now, we were still focusing on the basics: getting link-state routing
running, adding some simple _loop-free alternates_ policy to it, based
on the operation of [IP FRR](https://tools.ietf.org/html/rfc5286) and
running a bunch of flows over that network to measure packet loss when
we break a link. Sander was focusing on the policy design and
implementation, I was going to have a look at the IRATI code for
scaling up the flow counts, which needed non-blocking I/O. I won't go
into the details, but after that short hands-on stint in the IRATI
codebase, I was on board with Sander to start looking at options for
a RINA implementation beyond IRATI.
It was now summer 2015, PRISTINE would end in 12 months and the
project was committed to openIRATI, so at least for PRISTINE, we again
had no choice but to plow on. A couple of frustrating months lay
ahead of us, trying to get experimental results out of a prototype
that was nowhere near ready for it, and with a code base that was also
becoming so big and complex that it was impossible to fix for anyone
but the original developers. This is unfortunately the seemingly
inescapable fate of any software project whose development cycle is
heavily stressed by external deadlines, especially deadlines set
within the rigid timeline of a publicly funded research project.
By the end of summer, we were still a long way off the mark in terms
of what we hoped to achieve. The traffic generator tool and
configurator were ready, and the implementation of LFA was as good as
done, so we could deploy the machines for the use case scenarios,
which were about 20 nodes in size, on the testbeds. But the deployment
that actually worked was still limited to a 3-node PoC in a triangle
that showed the traffic getting routed over the two remaining links if
a link got severed.
In the meantime, Vincenzo had started work on his own RINA
implementation, [rlite](https://github.com/vmaffione/rlite), and
Sander and myself started discussing, on a more and more regular basis, what to do. Should we branch off IRATI and try to
clean it up? Keep only IRATI kernel space and rewrite user space? Hop
on the _rlite_ train? Or just start over entirely? Should we go
user-space entirely or keep parts in-kernel?
In the last semester of 2015, Sander was heading for a 3-month
research stint in Boston to work on routing in RINA with John and the
BU team. By that time, we had ruled out branching off of openIRATI.
Our estimate was that cleaning up the code base would be more work
than starting over. We'd have IRATI as an upstream dependency, and
trying to merge contributions upstream would lead to endless
discussions and further hamper progress for both projects. IRATI was
out. Continuing on rlite was still a feasible option. Vincenzo was
making progress fast, and we knew he was extremely talented. But we
were also afraid of running into disagreements on how to proceed. In
the meantime, Sander's original research plans in Boston got subverted
by a 'major review' decision on the _shim Hypervisor_ article, putting
priority on getting that accepted and published. When I visited Sander
in Boston at the end of October, we were again assessing the
situation, and agreed that the best decision was to start our own
prototype, to avoid having _too many cooks in the kitchen_.
Development was not part of some funded project, so we were free to
evaluate and scrutinize all design decisions, and we could get
feedback on the RINA mailing lists on our findings. When all
considerations settled, our own RINA implementation would target POSIX and be user-space only.
We were confident we could get it done, so we took the gamble. ARCFIRE
was going to start soon, but the first part of the project would be
tool development. Our experimentation contributions to PRISTINE were
planned to wrap up by April -- the project was planned to end in June,
but a 4-month extension pushed it to the end of October. But starting
May, we'd have some time to work on Ouroboros relatively
undisturbed. In the very worst case, if our project went down the
drain, we could still use IRATI or rlite to meet any objectives for
ARCFIRE. We named our new RINA-implementation-to-be _Ouroboros_, the
mythical snake that eats its own tail: it represents recursion, and also
-- with a touch of imagination -- resembles the operation of a _ring
buffer_.
### ARCFIRE
Another year, another RINA project kick-off, this time again in Barcelona, but without a co-located workshop. ARCFIRE (like
IRATI before it) was within the FIRE framework, and the objective was
to get some experiments running with a reasonable number of nodes (on
the order of 100) to demonstrate stability and scale of the prototypes
and also to bring tooling to the RINA community. The project was
coordinated by Sven van der Meer (Ericsson), who had done significant
work on the PRISTINE use cases, and would focus on the impact of RINA
on network management. The industry-inspired use cases were brought by
Diego López (Telefónica), _acteur incontournable_ in the
Network Functions Virtualization (NFV) world. The project was of
course topped off with i2cat, Nextworks, and ourselves, as we were
somewhere in the process of integration into IMEC. The order at hand
for us was to develop a fleshed-out testbed deployment framework for
RINA, which we named [Rumba](https://gitlab.com/arcfire/rumba). (A
rhumba is a bunch of rattlesnakes, and Ouroboros is a snake, and it
was written in Python -- rhumba already existed, and rumba was an
accepted alternate spelling).
In early 2016, the RINA landscape was very different from when we
embarked on IRATI in 2013. There were 2 open source prototypes: IRATI
was the de-facto standard used in EC projects, but Vincenzo's rlite
was also becoming available at the time and would be used in
ARCFIRE. And soon, the development of a third prototype -- _ouroboros_
-- would start. External perception of RINA in the scientific
community had also been shifting, and not in a positive direction. At
the start of the IRATI project, we had the position paper with project
plans and outlines, and the papers on the _shims_ showed some ways in which RINA could be deployed. But other articles trying to demonstrate
the benefits of RINA were -- despite all the efforts and good will of
all people involved -- lacking in quality, mostly due to the
limitations of the software. All these subpar publications did more
harm than good, as the quality of the publications rubbed off on the
perceived merits of the RINA architecture as a whole. We were always
feeling this pressure to publish _something_, _anything_ -- and
reviewers were always looking for a value proposition -- _Why is this
better than my preferred solution?_, _Compare this in depth to my
preferred solution_ -- that we simply couldn't support with data at
this point in time. And not for lack of wanting or trying. But at least ARCFIRE had 2 years to look forward to and a focused scope, and by now the team had a lot of experience in the bag. But for the
future of RINA, we knew the pressure was on -- this was a _now or
never_ type of situation.
### Ouroboros
We laid the first stone of Ouroboros on Friday, February 12th, 2016. At
that point in time Ouroboros was still planned as a RINA
implementation, so we started from the beginning: an empty git
repository under our cursor, renewed enthusiasm in our minds, fresh
_specs_ -- still warm from the printer and smelling of toner -- in our
hands, and Sander's initial software design and APIs in colored marker
on the whiteboard. Days were long -- we still had work to do on
PRISTINE, mind you -- and evenings were short. I could now imagine the
frustration of the i2cat people, who a couple of years prior were
probably also spending their evenings and nights enthusiastically
coding on IRATI while, for us, IRATI was still a (very interesting)
job rather than a passion. We would feel no such frustrations as we
knew from the outset that the development of Ouroboros was going to be
a two-man job.
While we were spending half our days gathering and compiling results
from our _LFA_ experiments for PRISTINE, which -- fortunately or
unfortunately depending on the way I look at it -- did not result in a
publication, and half our days on the rumba framework, our early
mornings and early evenings were filled with discussions on the RINA
API used in Ouroboros. It was initially based on IRATI. Flow
allocation used source and destination _naming information_ -- 4
objects that the RINA _specs_ (correctly, might I add) say should be
named: Application Process Name, Application Process Instance Id,
Application Entity Name and Application Entity Instance Id. This
_naming information_, as in IRATI, was built into a single structure --
a 4-tuple -- and we were quickly running into a mess, because, while
these names need to be identified, they are not resolved at the same
time, nor in the same place. Putting them in a single struct and
passing that around with NULL values all the time was really ugly. The
naming API in Ouroboros changed quickly over time, initially saving
some state in an _init_ call (the naming information of the current
application, for instance) and later on removing the source naming
information from the flow allocation protocol altogether, because it
could so easily be filled with fake garbage that one shouldn't rely on
it for anything. The four-tuple was then broken up into two 2-tuples of name and instance id, one for the Process, the other for the Entity. But we considered these changes to be just a footnote in the RINA service definition -- a matter of taste, one could take it or leave it, no
big deal. Little did we know that these small changes were just the
start -- the first notes of a gentle, breezy prelude that was slowly
building towards a fierce, stormy cadenza that would signify the
severance of Ouroboros from RINA almost exactly one year later.
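To make the problem a bit more concrete, here is a rough sketch in C of the two API shapes described above. The struct and field names are purely illustrative -- they are not the actual IRATI or Ouroboros definitions -- but they show why a single 4-tuple forces callers to pass NULLs around, while two 2-tuples can be resolved at different times and in different places.
```
/* Illustrative only: hypothetical structs, not the real IRATI or
 * Ouroboros definitions. */

/* One struct holding all naming information. Callers that only know
 * (or only care about) part of it end up passing NULL fields around. */
struct naming_info {
        char * ap_name;     /* Application Process Name        */
        char * ap_instance; /* Application Process Instance Id */
        char * ae_name;     /* Application Entity Name         */
        char * ae_instance; /* Application Entity Instance Id  */
};

/* Two 2-tuples instead: the process name is resolved by the network
 * during flow allocation; the entity name only matters to the
 * applications once the flow exists. */
struct proc_name {
        char * name;
        char * instance;
};

struct entity_name {
        char * name;
        char * instance;
};
```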
Another such change was with the _register_ function. To be able to
reach a RINA application, you need to register it in the _DIF_. When
we were implementing this, it just struck us that this code was being
repeated over and over again in applications. And just think about it,
_how does an application know which DIFs there are in the system?_ And if new DIFs are created while the application is running, how do I feed it that information? That's all functionality that would have to be included in _every_ RINA application. IRATI has this as a whole set of
library calls. But we did something rather different. We moved the
registering of applications _outside_ of the applications
themselves. It's _application management_, not _IPC_. Think about how
much simpler this small change makes life for an application
developer, and a network administrator. Think about how it would be if
-- in the IP world -- you could create a socket on port 80 or port 443
_from the shell_, and set options on that socket _from the shell_, and
then tell your kernel that incoming connections on that socket should
be sent to this Apache or that Nginx program _from the shell_, and all
the Apache or Nginx developers would need to do is call accept() and read/write/select/epoll calls, instead of having to handle
sockets and all their options. That's what the bind() and register()
calls in Ouroboros do for Ouroboros applications: you bind some
program to a name _from the command line_, you register that name in
the layer _from the command line_, and all the (server) program has
to do is call _flow\_accept()_ and it will receive incoming flows. It
is this change in the RINA API that inspired us to name our very first
public presentation about Ouroboros, at FOSDEM 2018,
[IPC in 1-2-3](https://archive.fosdem.org/2018/schedule/event/ipc/).
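To give an idea of how little is left for the (server) program to do, here is a minimal echo-server sketch against the Ouroboros application API as I remember it (flow_accept / flow_read / flow_write / flow_dealloc from ouroboros/dev.h). Treat it as a sketch: the exact signatures may differ between versions, and the bind and register steps happen beforehand from the shell with the irm tool, not in the program.
```
#include <ouroboros/dev.h>

#include <stdio.h>
#include <sys/types.h>

/*
 * Minimal echo server sketch. Binding this program to a name and
 * registering that name in a layer is done beforehand from the shell
 * (with the irm tool); the program itself only accepts flows.
 * Function signatures are from memory and may differ between versions.
 */
int main(void)
{
        char buf[256];

        for (;;) {
                /* Block until an incoming flow arrives
                 * (no QoS spec, no timeout). */
                int fd = flow_accept(NULL, NULL);
                if (fd < 0) {
                        fprintf(stderr, "Failed to accept flow.\n");
                        return -1;
                }

                /* Echo a single message back on the flow. */
                ssize_t len = flow_read(fd, buf, sizeof(buf));
                if (len > 0)
                        flow_write(fd, buf, (size_t) len);

                flow_dealloc(fd);
        }
}
```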
When we tried to propose them to the RINA community, these changes
were not exactly met with cheers. The interactions with that community
were also beginning to change. RINA was the _specs_. Why are we now
again asking questions about basic things that we implemented in IRATI
years ago? IRATI shows it works. Want to change the _specs_? Talk to
John.
We had also implemented our first _shim DIF_, which would allow us to run
the Ouroboros prototype over UDP/IPv4. We started with a UDP shim
because there is a POSIX sockets API for UDP. Recall that we were
targeting POSIX, including FreeBSD and MacOS X to make the Ouroboros
prototype more accessible. But programming interfaces into Ethernet,
such as _raw sockets_, were not standardized across operating systems, so
we would implement an Ethernet _shim DIF_ later. Now, the Ouroboros
_shim DIF_ stopped being a _shim_ pretty fast. When we were developing
the _shim DIFs_ for IRATI, there was one very important rule: we were
not allowed to add functionality to the protocol we were wrapping with
the RINA API, we could only _map_ functions that already existed in the (Ethernet/UDP) protocol. This -- so went the underlying reasoning -- would
show that the protocol/layers in the current internet were
_incomplete_ layers. But that also meant that the functions that were
not present -- the flow allocator in particular -- would need to be
circumvented through manual configuration at the endpoints. We weren't
going to have any of that -- the Ouroboros IPCP daemons all implement
a flow allocator. You may also be wondering why none of the prototypes
have a _shim DIF_ directly over IP. It's perfectly possible! But the
reason is simple: it would use a non-standardized value for the
_protocol_ field in the IP header, and most IP routers simply drop
such packets.
Somewhere around April, we were starting the implementation of a
_normal_ IPCP in Ouroboros, and another RINA component was quickly
becoming a nuisance to me: the _Common Distributed Application
Protocol_ or _CDAP_. While I had no problem with the objectives of
CDAP, I was -- to put it mildly -- not a big fan of the
object-oriented paradigm that was underneath it. Its methods,
_read/write, create/destroy, start/stop_ make sense to many, but just
like the HTTP methods PUT/GET/DELETE/POST/... there is nothing
_fundamental_ about it. It might as well have just one method,
_[execute](http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html)_.
It's taste, and it definitely wasn't _my_ taste. I found that it only
proved my long-holding observation that for every engineer there are
at least three overengineers. I made a bold prediction to Sander: one
day, we would kick CDAP out of the Ouroboros prototype.
Summer was approaching again. Most of the contributions to PRISTINE
were in, so the ARCFIRE partners could start to focus on that
project. There was a risk: ARCFIRE depended on the Fed4FIRE testbeds,
and Fed4FIRE was ending and its future was not certain. The projected
target API for _rumba_ was
[jFed](https://jfed.ilabt.imec.be/).
To mitigate the risk, we made an inventory of other potential
testbeds, and to accommodate the wait for the results of the funding calls, we proposed (and got) a 6-month extension to ARCFIRE, to a 30-month project duration. In the end, Fed4FIRE+ was
funded, ARCFIRE had some breathing space -- after all, we had to fire
on all cylinders to get the best possible results and make a case for
RINA -- and Sander and myself had some extra time to get Ouroboros up
and running.
Sander quickly developed an Ethernet LLC _shim DIF_ based on the UDP
one, and after that, we both moved our focus to the key components in
the _normal IPCP_, implementing the full flow allocator and building
the data transfer protocol (DTP), and the routing and forwarding
functionality. CDAP was getting more and more annoying, but apart from
that, this part of the RINA _specs_ was fairly mature following the
implementation work in IRATI, and the implementation progress was
steady and rather uneventful. For now.
Work on the PRISTINE project was wrapped up, and the final
deliverables were submitted at the end of October. PRISTINE was a
tough project for us, with very few outcomes. Together with Miquel,
I did make some progress with RINA standardization in ISO
JTC1/SC6. But Sander and myself could show few research results, and no
published papers where we were the main authors. PRISTINE as a whole
also fell a bit short of its main objectives: the RINA community
hadn't substantially grown, and its research results were still --
from an external vantage point -- mediocre. For us, it was a story of
trying to do too much, too soon. Everyone tried their best, and I
think we achieved what was achievable given the time and resources we
had. The project definitely had some nice outcomes. Standardization at
least got somewhere, with a project in ISO and also some traction
within the Next Generation Protocols (NGP) group at
[ETSI](https://www.etsi.org). RINAsim was a nice educational tool, especially
for visualizing the operation of RINA.
Our lack of publication output was also noticed by our direct
superiors at the University, who got more and more anxious. The
relationship deteriorated steadily; we were constantly nagged about
publications, _minimum viable papers_, and the _value proposition_ of
RINA: _killer features_, _killer apps_. For us, the simplicity and
elegance of the design was all we needed as a motivation to
continue. There were some suggestions to build a simulator instead of
a full prototype. My feeling was that a simulator would be
unconvincing to show any _benefits of RINA_ -- I can't express in
words how much I hated that phrase. To prove anything, simulators need
to be validated against the real thing. And there are certain pitfalls
that can only be found in an implementation. This is the reason why I
chose that particular quote at the top of this blog post. Both parties
started to sound like broken records to each other; every meeting was
devolving into a pointless competition in
who-knows-the-most-workarounds. As the saying goes, arguing with an
engineer is like wrestling a pig in the mud. There wasn't anything
constructive or useful to those interactions, so we stopped giving a
shit -- pardon my French. The Ouroboros prototype was coming along, we
were confident that we knew what we were doing. All we needed was time
to get it done. We'd write a paper on Ouroboros when we had one worth
writing.
By January 2017, we had a minimal working _normal_ IPCP. Sander was
looking into routing, working on a component we called the _graph
adjacency manager_ (GAM). As its name suggests, the GAM would be
responsible for managing links in the network, what would be referred
to as the _network topology_, and would get policies that instruct it
how to maintain the graph based on certain parameters. This component,
however, was short-lived and replaced by an API to connect IPCPs so
the actual layer management logic could be a standalone program
outside of the IPCPs instead of a module inside the IPCPs, which is
far more flexible.
### Ouroboros diverges from RINA
In the meantime, I was implementing and revising _CACEP_, the Common
Application Connection Establishment Phase that was accompanying CDAP
in RINA. Discussions on CACEP between Sander and myself were
interesting and sometimes heated -- whiteboard markers have
experienced flight and sudden deceleration. CDAP was supposed to
support different encoding schemes -- the OSI _presentation layer_. We
were only going to implement Google Protocol Buffers, which was also
used in IRATI, but the support for others should be there. The flow
allocator and the RIB were built on top of our CDAP
implementation. And something was becoming more and more obvious. What
we were implementing -- agreeing on protocol versions, encoding etc --
was something rather universal to all protocols. Now, you may
remember that the flow allocator is passing something -- the
information needed to connect to a specific Application Entity or
Application Entity Instance -- that was actually only needed after the flow allocation procedure had basically completed. But after a
while, it was clear to me that this information should be _there_ in
that CACEP part, and was rather universal for all application
connections, not just CDAP. After I presented this to Sander (_despair_ on IRC), he actually recognized how this -- to me seemingly small --
change impacted the entire architecture. Now, I will never forget the
exchange, and I actually saved that conversation as a text file. The
date was February 24th, 2017.
```
...
<despair> nice, so then dev.h is even simpler
<despair> ae name is indeed not on the layer boundary
<dstaesse> wait why is dev.h simpler?
<despair> since ae name will be removed there
<dstaesse> no
<dstaesse> would you?
<despair> yes
<despair> nobody likes balls on the line
<despair> it's balls out
...
```
Now, RINA experts will (or should) gasp for air when reading this. It
refers to something that traces back to John's ISO JTC1/SC6 days
working on Open Systems Interconnection (OSI), when there was a heavy
discussion ongoing about the "Application Entity": _where was it
located_? If it was in the _application_, it would be outside of SC6,
which was dealing with networks; if it was in the network, it would be dealt with _only_ in SC6. It was a turf battle between two ISO
groups, and because Application Entities were usually drawn as a set
of circles, and the boundary between the network and the application
as a line, that battle was internally nicknamed -- boys will be boys
-- the _balls-in, balls-out_ question. If you ever attended one of
John's presentations, he would take a short pause and then continue:
"this was the only time that a major insight came from a turf war":
_the balls were on the line_. The Application Entity needed to be
known in both the application and the network. Alas! Our
implementation was clearly showing that this was not the case. The
balls were _above_ the line: the _network_ (or more precisely: the flow
allocator) doesn't need to know _anything_ about application entities!
Then and there, we had found a mistake in RINA.
Ouroboros now had a crisp and clear boundary between the flow in a
_DIF_, and any connections using that flow in the layer above. Flow
allocation creates a flow between _Application Instances_ and after
that, a connection phase would create a _connection_ between
_Application Entity Instances_. So roughly speaking -- without the
OSI terminology -- first the network connects the running programs,
and after that, the programs decide which protocol to use (which can
be implicit). What was in the _specs_, what the RINA API was actually
doing, was piggybacking these exchanges! Now, we have no issues with
that from an operational perspective: _en effet_, the Ouroboros flow
allocator has a _piggyback API_. But the contents of the piggybacked
information in Ouroboros are _opaque_. And all this has another, even
bigger, implication. One that I would only figure out via another line
of reasoning some time later.
With ARCFIRE rolling along and the implementation of the _rumba_
framework in full swing, Sander was working on the link-state routing
policy for Ouroboros, and I started implementing a _Distributed Hash
Table (DHT)_ that would serve as the directory -- think of the
equivalent of [DNS-SRV](https://en.wikipedia.org/wiki/SRV_record) for
a RINA DIF -- a key-value store mapping _application names_ to
_addresses_ in the layer. The link-state routing component was
something that was really closely related to the Resource Information
Base -- the RIB. That RIB was closely coupled with CDAP. Remember that
prediction that I made about a year prior, somewhere in April 2016? On
September 9th 2017, two weeks before the ARCFIRE RINA hackathon, CDAP
was removed from Ouroboros. I still consider it the most satisfying
[git commit](https://ouroboros.rocks/cgit/ouroboros/commit/?id=45c6615484ffe347654c34decb72ff1ef9bde0f3&h=master)
of my life, removing 3700 lines of utter uselessness -- CDAP got 3 out
of 4 characters right. From that day, Ouroboros could definitely not
be considered a RINA implementation anymore.
It was time to get started on the last big component: DTCP -- the
_Data Transfer Control Protocol_. When implementing this, a couple of
things were again quickly becoming clear. First, the implementation
was proving to be completely independent of DTP. The RINA _specs_, you
may recall, propose a state vector shared between DTP and DTCP. This solves the _fragmentation problem_ in TCP/IP: if an IP fragment gets lost, TCP has to resend the entire segment, i.e. all fragments. To avoid that, the transport protocol needs to know about the fragmentation below it and retransmit only the bytes in the lost fragment.
But the code was again speaking otherwise. It was basically telling
us: TCP was independent of IP. But fragmentation should be in TCP, and
IP should specify its maximum packet size. Anything else would result
in an intolerable mess. So that's how we split the _Flow and
Retransmission Control Protocol_ (FRCP) and the _Data Transfer
Protocol_ (DTP) in Ouroboros. Another mistake in RINA.
With FRCP split from DTP roughly along the same line as TCP was
originally split from IP, we had a new question: where to put FRCP?
RINA has DTCP/DTP in the layer as EFCP. And this resulted in something
that I found rather ugly: a normal layer would "bootstrap" its traffic
(e.g. flow allocator) over its own EFCP implementation to deal with
underlying layers that do not have EFCP (such as the _shim
DIFs_). Well, fair enough I guess. But there is another thing. One
that bugged me even more. RINA has an assumption on the _system_, one
that has to be true. The EFCP implementation -- which is the guarantee
that packets are delivered, and that they are delivered in-order -- is
in the IPCP. But the application process that makes use of the IPCP is
a _different process_. So, in effect, the transfer of data, the IPC,
between the Application Process and the IPCP has to be reliable and
preserve data order _by itself_. RINA has no control over this
part. RINA is not controlling _ALL_ IPC; there is IPC _outside of
RINA_. Another way of seeing it, is like this: If a set of processes
(IPCPs) are needed to provide reliable state synchronization between
two applictions A and B, who is providing reliable state
synchronization between A and the IPCP? If it's again an IPCP,
that's _infinite_ recursion! Now -- granted -- this is a rather
_academic_ issue, because most (all?) computer hardware does provide
this kind of reliable, order-preserving IPC. However, to me, even theoretical issues
were issues. I wanted Ouroboros to be able to guarantee _ALL_ IPC,
even between its own components, and not make _any_ assumptions! Then,
and only then, it would be universal. Then, and only then, the
_unification of networking and IPC_ would be complete.
The third change in the architecture was the big one. And in
hindsight, we should already have seen that coming with our
realization that the application entity was _above the line_: we moved
FRCP into the application. It would be implemented in the library, not
in the IPCP, as a set of function calls, just like HTTP
libraries. Sander was initially skeptical, because to his taste, if a
single-threaded application uses the library, it should remain
single-threaded. How could it send acknowledgements, retransmit
packets etc? And the RINA specs had congestion avoidance as part of
EFCP/DTCP. At least that shouldn't be in the application!? I agreed,
but said I was confident that we could make the single-threaded thing work by running the functionality as part of the IPC calls,
read/write/fevent. And congestion avoidance logic should be in the
IPCP in the flow allocator. And that's how it's implemented now. All
this meant that Ouroboros layers were not DIFs, and we stopped using
that terminology.
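To illustrate how reliability logic can live in the library without forcing extra threads on the application, here is a small self-contained toy in C -- emphatically not the actual Ouroboros FRCP code, just the design idea: every time the application enters the library through a normal I/O call, the library first services its retransmission timers, so a single-threaded caller never needs a separate timer thread.
```
#include <stdio.h>
#include <sys/types.h>
#include <time.h>

/* Toy per-flow reliability state, kept inside the "library". */
struct frct_state {
        time_t last_tx;  /* when the oldest unacked packet was sent */
        int    unacked;  /* number of unacknowledged packets        */
};

/* Housekeeping, run lazily on every entry into the library. */
static void frct_tick(struct frct_state * f)
{
        if (f->unacked > 0 && time(NULL) - f->last_tx > 2) {
                printf("[frct] retransmitting %d packet(s)\n", f->unacked);
                f->last_tx = time(NULL);
        }
}

/* Toy stand-in for the library's write call: housekeeping happens
 * on the application's own thread, piggybacked on the call itself. */
static ssize_t toy_flow_write(struct frct_state * f,
                              const void * buf, size_t count)
{
        (void) buf;
        frct_tick(f);
        f->unacked++;
        f->last_tx = time(NULL);
        return (ssize_t) count;
}

int main(void)
{
        struct frct_state flow = { 0, 0 };
        const char msg[] = "hello";

        toy_flow_write(&flow, msg, sizeof(msg));
        printf("[app ] sent %zu bytes, %d unacked\n", sizeof(msg), flow.unacked);

        return 0;
}
```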
By now, the prototype was running stably enough for us to go _open
source_. We got approval from IMEC to release it to the public under
the GPLv2 / LGPL license, and in early 2018, almost exactly 2 years
after we started the project, we presented the first public version of
Ouroboros at FOSDEM 2018 in Brussels.
But we were still running against the clock. ARCFIRE was soon to end,
and Ouroboros had undergone quite some unanticipated changes that
meant the implementation was facing the reality of [Hofstadter's
Law](https://en.wikipedia.org/wiki/Hofstadter%27s_law).
We were again under pressure to get some publications out in order to meet ARCFIRE objectives, and Sander had to meet a publication quota
to finish his PhD. The design of Rumba was interesting enough for a
[paper](https://www.geni.net/); the implementation allowed us to deploy 3 Recursive Network prototypes (IRATI, rlite and Ouroboros) on testbeds using different APIs: jFed for Fed4FIRE and [GENI](https://www.geni.net/), Emulab for the iMinds Virtual Wall testbed, QEMU using virtual machines, Docker using -- well -- Docker
containers, and a local option only for Ouroboros. But we needed more
publications, so for ARCFIRE Sander had implemented Loop-Free
Alternates routing in Ouroboros and was getting some larger-scale
results with it. And I reluctantly started working on a paper on
Ouroboros -- I still felt the time wasn't right, and we first needed
to have a full FRCP implementation and full congestion avoidance to
make a worthwhile analysis. By then I had long had a feeling that my days
at the university were numbered, it was time to move on, and I was
either leaving after submitting a publication on Ouroboros, or without
a publication on Ouroboros.
In May 2018 there was another RINA workshop, where I presented
Ouroboros. The feedback I got from John was characteristically short:
_It's stupid_.
We finished the experiments for ARCFIRE, but as with PRISTINE, the
results were not accepted for publication. During the writing of the
paper, a final realization came. We had implemented our link-state
routing a while ago, and it was doing something interesting, akin to
all link-state routing protocols: a link-state packet that came in on
some flow, was sent out on all other flows. It was -- in effect
--doing broadcast. But... OSPF is doing the same. Wait a minute. OSPF
uses a multicast IP address. But of course! Multicast wasn't what it
seemed to be. Multicast was broadcast on a layer, creating a multicast
group was enrollment in that layer. A multicast IP address is a
broadcast layer name! Let that one sink in. Based on the link-state
routing code in the _normal IPCP_, I implemented the broadcast IPCP in
a single night. The _normal IPCP_ was renamed _unicast IPCP_. It had
all fallen into place; the Ouroboros architecture had taken shape.
But we had no value proposition to pitch, no value-added feature, no
killer app, no unique selling point. Elegance? I received my notice on
Christmas Eve 2018. Life as a researcher would be over. But what a
ride those last 3 years had been. I'd do the same all over again.
The [paper](https://arxiv.org/abs/2001.09707) was submitted in January
2019. We haven't received any word about it since.
With the GPL license on Ouroboros, Sander and myself decided to
continue to update the prototype and build a bit of a website for
it. So, if you made it all the way to the end of this blog post: thank
you for your interest in the project, that's why we did what we did,
and continue to do what we do.
Stay curious,
Dimitri