uIP can be found at the uIP web page: http://www.sics.se/~adam/uip/
Compile-time configuration options
Run-time configuration functions
Device driver interface and variables used by device drivers
uIP functions called from application programs (see below) and the protosockets API and their underlying protothreads
Traditional TCP/IP implementations have required far too many resources, both in terms of code size and memory usage, to be useful in small 8- or 16-bit systems. Code sizes of a few hundred kilobytes and RAM requirements of several hundred kilobytes have made it impossible to fit the full TCP/IP stack into systems with a few tens of kilobytes of RAM and room for less than 100 kilobytes of code.
The uIP implementation is designed to have only the absolute minimal set of features needed for a full TCP/IP stack. It can only handle a single network interface and contains the IP, ICMP, UDP and TCP protocols. uIP is written in the C programming language.
Many other TCP/IP implementations for small systems assume that the embedded device will always communicate with a full-scale TCP/IP implementation running on a workstation-class machine. Under this assumption, it is possible to remove certain TCP/IP mechanisms that are very rarely used in such situations. Many of those mechanisms are essential, however, if the embedded device is to communicate with another equally limited device, e.g., when running distributed peer-to-peer services and protocols. uIP is designed to be RFC compliant in order to let embedded devices act as first-class network citizens. The uIP TCP/IP implementation is not tailored for any specific application.
TCP provides a reliable byte stream to the upper layer protocols. It breaks the byte stream into appropriately sized segments and each segment is sent in its own IP packet. The IP packets are sent out on the network by the network device driver. If the destination is not on the physically connected network, the IP packet is forwarded onto another network by a router that is situated between the two networks. If the maximum packet size of the other network is smaller than the size of the IP packet, the packet is fragmented into smaller packets by the router. If possible, the size of the TCP segments is chosen so that fragmentation is minimized. The final recipient of the packet will have to reassemble any fragmented IP packets before they can be passed to higher layers.
The formal requirements for the protocols in the TCP/IP stack are specified in a number of RFC documents published by the Internet Engineering Task Force (IETF). Each of the protocols in the stack is defined in one or more RFC documents, and RFC1122 collects all requirements and updates the previous RFCs.
The RFC1122 requirements can be divided into two categories: those that deal with host-to-host communication and those that deal with communication between the application and the networking stack. An example of the first kind is "A TCP MUST be able to receive a TCP option in any segment" and an example of the second kind is "There MUST be a mechanism for reporting soft TCP error conditions to the application." A TCP/IP implementation that violates requirements of the first kind may not be able to communicate with other TCP/IP implementations and may even lead to network failures. Violation of the second kind of requirements will only affect the communication within the system and will not affect host-to-host communication.
In uIP, all RFC requirements that affect host-to-host communication are implemented. However, in order to reduce code size, we have removed certain mechanisms in the interface between the application and the stack, such as the soft error reporting mechanism and dynamically configurable type-of-service bits for TCP connections. Since there are only very few applications that make use of those features they can be removed without loss of generality.
If a packet has arrived, the input handler function, uip_input(), should be invoked by the main control loop. The input handler function will never block, but will return at once. When it returns, the stack or the application for which the incoming packet was intended may have produced one or more reply packets which should be sent out. If so, the network device driver should be called to send out these packets.
Periodic timeouts are used to drive TCP mechanisms that depend on timers, such as delayed acknowledgments, retransmissions and round-trip time estimations. When the main control loop infers that the periodic timer should fire, it should invoke the timer handler function uip_periodic(). Because the TCP/IP stack may perform retransmissions when dealing with a timer event, the network device driver should be called to send out the packets that may have been produced.
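The control flow described above can be sketched as follows. This is a minimal, self-contained illustration rather than uIP's actual main loop: every demo_ name is a hypothetical stand-in (for the device driver, for the input handler uip_input(), and for the timer handler uip_periodic()), and the stack is modeled by a single outgoing-length variable.

```c
#include <stddef.h>

/* All demo_* names are hypothetical stand-ins, not part of uIP. */
static size_t demo_out_len;       /* bytes the "stack" wants transmitted */
static unsigned demo_sent;        /* packets handed to the driver so far */
static int demo_rexmit_pending;   /* set when a retransmission is due */

/* Stand-in for the input handler: pretend every packet yields a reply. */
static void demo_input(size_t in_len) { demo_out_len = in_len; }

/* Stand-in for the timer handler: produces a packet only when a
   retransmission is pending. */
static void demo_periodic(void) { demo_out_len = demo_rexmit_pending ? 40 : 0; }

/* Stand-in for the driver's transmit routine. */
static void demo_driver_send(void) {
  if(demo_out_len > 0) {
    demo_sent++;
    demo_out_len = 0;
  }
}

/* One iteration of the main control loop. */
void demo_loop_step(size_t rx_len, int timer_expired) {
  if(rx_len > 0) {             /* a packet has arrived */
    demo_input(rx_len);        /* returns at once, never blocks */
    demo_driver_send();        /* send any reply that was produced */
  } else if(timer_expired) {   /* the periodic timer should fire */
    demo_periodic();
    demo_driver_send();        /* send any retransmission produced */
  }
}
```

The point of the sketch is the ordering: after either handler returns, the driver is given the chance to transmit whatever the stack or the application left in the buffer.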
While uIP includes a generic checksum function, it also leaves it open for an architecture specific implementation of the two functions uip_ipchksum() and uip_tcpchksum(). The checksum calculations in those functions can be written in highly optimized assembler rather than generic C code.
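For orientation, a generic Internet checksum in portable C might look like the sketch below. It follows the one's-complement sum described in RFC 1071; the function name and signature are hypothetical and do not correspond to uip_ipchksum() or uip_tcpchksum(), which operate on uIP's own packet buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a uIP function: one's-complement Internet
   checksum over len bytes, treating the data as big-endian 16-bit words. */
uint16_t inet_chksum_sketch(const uint8_t *data, size_t len) {
  uint32_t sum = 0;
  size_t i;

  for(i = 0; i + 1 < len; i += 2) {
    sum += ((uint32_t)data[i] << 8) | data[i + 1];
  }
  if(len & 1) {                 /* odd trailing byte is padded with zero */
    sum += (uint32_t)data[len - 1] << 8;
  }
  while(sum >> 16) {            /* fold carries back into the low 16 bits */
    sum = (sum & 0xffff) + (sum >> 16);
  }
  return (uint16_t)~sum;
}
```

An architecture-specific version would typically replace the loop with hand-written assembler that uses the CPU's add-with-carry instruction.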
While uIP implements a generic 32-bit addition, there is support for having an architecture specific implementation of the uip_add32() function.
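A generic 32-bit addition built only from 8-bit steps could be sketched as below. The name, the signature and the choice of four big-endian bytes for the 32-bit operand are assumptions made for illustration; this is not the uip_add32() function itself.

```c
#include <stdint.h>

/* Hypothetical sketch, not uIP's uip_add32(): adds a 16-bit value to a
   32-bit number stored as four big-endian bytes, one byte at a time. */
void add32_sketch(const uint8_t op32[4], uint16_t op16, uint8_t out[4]) {
  const uint8_t add[4] = {0, 0, (uint8_t)(op16 >> 8), (uint8_t)(op16 & 0xff)};
  uint16_t acc = 0;             /* running sum; high byte is the carry */
  int i;

  for(i = 3; i >= 0; i--) {     /* least significant byte first */
    acc = (uint16_t)(op32[i] + add[i] + acc);
    out[i] = (uint8_t)(acc & 0xff);
    acc >>= 8;                  /* carry into the next byte */
  }
}
```

On a CPU with an add-with-carry instruction, the loop body collapses into a handful of instructions, which is the motivation for allowing an architecture-specific replacement.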
The uIP stack does not use explicit dynamic memory allocation. Instead, it uses a single global buffer for holding packets and has a fixed table for holding connection state. The global packet buffer is large enough to contain one packet of maximum size. When a packet arrives from the network, the device driver places it in the global buffer and calls the TCP/IP stack. If the packet contains data, the TCP/IP stack will notify the corresponding application. Because the data in the buffer will be overwritten by the next incoming packet, the application will either have to act immediately on the data or copy the data into a secondary buffer for later processing. The packet buffer will not be overwritten by new packets before the application has processed the data. Packets that arrive when the application is processing the data must be queued, either by the network device or by the device driver. Most single-chip Ethernet controllers have on-chip buffers that are large enough to contain at least 4 maximum sized Ethernet frames. Devices that are handled by the processor, such as RS-232 ports, can copy incoming bytes to a separate buffer during application processing. If the buffers are full, the incoming packet is dropped. This will cause performance degradation, but only when multiple connections are running in parallel. This is because uIP advertises a very small receiver window, which means that only a single TCP segment will be in the network per connection.
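An application that cannot act on incoming data immediately has to copy it out of the shared packet buffer before returning. The sketch below shows one way to do that; the queue, its size and the helper name are all hypothetical, and the two parameters stand in for the pointer to the received data and its length.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_QUEUE_SIZE 32      /* hypothetical size of the secondary buffer */

static unsigned char demo_queue[DEMO_QUEUE_SIZE];
static size_t demo_queue_len;

/* Hypothetical helper: copy incoming data out of the shared packet buffer
   before returning to the stack. Returns the number of bytes kept; data
   that does not fit is dropped by this sketch. */
size_t demo_save_data(const void *appdata, size_t datalen) {
  size_t room = DEMO_QUEUE_SIZE - demo_queue_len;
  size_t n = datalen < room ? datalen : room;

  memcpy(demo_queue + demo_queue_len, appdata, n);
  demo_queue_len += n;
  return n;
}
```

A real application would more likely stop the flow of incoming data when the secondary buffer fills up than silently drop bytes.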
In uIP, the same global packet buffer that is used for incoming packets is also used for the TCP/IP headers of outgoing data. If the application sends dynamic data, it may use the parts of the global packet buffer that are not used for headers as a temporary storage buffer. To send the data, the application passes a pointer to the data as well as the length of the data to the stack. The TCP/IP headers are written into the global buffer and once the headers have been produced, the device driver sends the headers and the application data out on the network. The data is not queued for retransmissions. Instead, the application will have to reproduce the data if a retransmission is necessary.
The total amount of memory usage for uIP depends heavily on the applications of the particular device in which the implementation is to be run. The memory configuration determines both the amount of traffic the system should be able to handle and the maximum number of simultaneous connections. A device that will be sending large e-mails while at the same time running a web server with highly dynamic web pages and multiple simultaneous clients will require more RAM than a simple Telnet server. It is possible to run the uIP implementation with as little as 200 bytes of RAM, but such a configuration will provide extremely low throughput and will only allow a small number of simultaneous connections.
uIP provides two APIs to programmers: protosockets, a BSD socket-like API without the overhead of full multi-threading, and a "raw" event-based API that is more low-level than protosockets but uses less memory.
uIP is different from other TCP/IP stacks in that it requires help from the application when doing retransmissions. Other TCP/IP stacks buffer the transmitted data in memory until the data is known to be successfully delivered to the remote end of the connection. If the data needs to be retransmitted, the stack takes care of the retransmission without notifying the application. With this approach, the data has to be buffered in memory while waiting for an acknowledgment even if the application might be able to quickly regenerate the data if a retransmission has to be made.
In order to reduce memory usage, uIP utilizes the fact that the application may be able to regenerate sent data and lets the application take part in retransmissions. uIP does not keep track of packet contents after they have been sent by the device driver, and uIP requires that the application takes an active part in performing the retransmission. When uIP decides that a segment should be retransmitted, it calls the application with a flag set indicating that a retransmission is required. The application checks the retransmission flag and produces the same data that was previously sent. From the application's standpoint, performing a retransmission is not different from how the data originally was sent. Therefore the application can be written in such a way that the same code is used both for sending data and retransmitting data. Also, it is important to note that even though the actual retransmission operation is carried out by the application, it is the responsibility of the stack to know when the retransmission should be made. Thus the complexity of the application does not necessarily increase because it takes an active part in doing retransmissions.
The application sends data by using the uIP function uip_send(). The uip_send() function takes two arguments: a pointer to the data to be sent and the length of the data. If the application needs RAM space for producing the actual data that should be sent, the packet buffer (pointed to by the uip_appdata pointer) can be used for this purpose.
The application can send only one chunk of data at a time on a connection and it is not possible to call uip_send() more than once per application invocation; only the data from the last call will be sent.
The application must check the uip_rexmit() flag and produce the same data that was previously sent. From the application's standpoint, performing a retransmission is not different from how the data originally was sent. Therefore, the application can be written in such a way that the same code is used both for sending data and retransmitting data. Also, it is important to note that even though the actual retransmission operation is carried out by the application, it is the responsibility of the stack to know when the retransmission should be made. Thus the complexity of the application does not necessarily increase because it takes an active part in doing retransmissions.
If the connection has been closed by the remote end, the test function uip_closed() is true. The application may then do any necessary cleanups.
The polling event has two purposes. The first is to let the application periodically know that a connection is idle, which allows the application to close connections that have been idle for too long. The other purpose is to let the application send new data that has been produced. The application can only send data when invoked by uIP, and therefore the poll event is the only way to send data on an otherwise idle connection.
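The first purpose, closing connections that have been idle for too long, can be sketched with a per-connection counter. The state structure, the limit and the helper below are hypothetical; the two flag parameters stand in for the results of uip_poll() and uip_newdata() inside a real application function, which would call uip_close() when the helper returns 1.

```c
/* Hypothetical per-connection state, kept by the application. */
struct demo_idle_state {
  unsigned idle_polls;          /* consecutive poll events without activity */
};

#define DEMO_MAX_IDLE_POLLS 3   /* assumed limit, chosen for illustration */

/* Returns 1 if the connection should be closed, 0 otherwise. */
int demo_idle_check(struct demo_idle_state *s, int polled, int newdata) {
  if(newdata) {
    s->idle_polls = 0;          /* activity resets the idle counter */
  } else if(polled && ++s->idle_polls >= DEMO_MAX_IDLE_POLLS) {
    return 1;                   /* idle for too long */
  }
  return 0;
}
```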
The application can check the lport field in the uip_conn structure to check to which port the new connection was connected.
The function uip_ipaddr() may be used to pack an IP address into the two element 16-bit array used by uIP to represent IP addresses.
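To make the representation concrete, the sketch below packs four octets into a two-element 16-bit array. It assumes, as the use of HTONS() in the examples suggests, that addresses are kept in network byte order; the function is a hypothetical illustration and not the uip_ipaddr() implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, not the uIP implementation: pack a.b.c.d into two
   16-bit elements so that the bytes lie in memory in network order. */
void ipaddr_pack_sketch(uint16_t addr[2], uint8_t a, uint8_t b,
                        uint8_t c, uint8_t d) {
  const uint8_t bytes[4] = {a, b, c, d};

  memcpy(addr, bytes, sizeof bytes);  /* independent of host endianness */
}
```

Copying bytes rather than shifting and OR-ing 16-bit values keeps the in-memory layout the same on little-endian and big-endian hosts.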
Two examples of usage are shown below. The first example shows how to open a connection to TCP port 8080 of the remote end of the current connection. If there are not enough TCP connection slots to allow a new connection to be opened, the uip_connect() function returns NULL and the current connection is aborted by uip_abort().
    void connect_example1_app(void) {
      if(uip_connect(uip_conn->ripaddr, HTONS(8080)) == NULL) {
        uip_abort();
      }
    }
The second example shows how to open a new connection to a specific IP address. No error checks are made in this example.
    void connect_example2(void) {
      u16_t ipaddr[2];

      uip_ipaddr(ipaddr, 192,168,0,1);
      uip_connect(ipaddr, HTONS(8080));
    }
The implementation of this application is shown below. The application is initialized with the function called example1_init() and the uIP callback function is called example1_app(). For this application, the configuration variable UIP_APPCALL should be defined to be example1_app().
    void example1_init(void) {
      uip_listen(HTONS(1234));
    }

    void example1_app(void) {
      if(uip_newdata() || uip_rexmit()) {
        uip_send("ok\n", 3);
      }
    }
The initialization function calls the uIP function uip_listen() to register a listening port. The actual application function example1_app() uses the test functions uip_newdata() and uip_rexmit() to determine why it was called. If the application was called because the remote end has sent it data, it responds with an "ok". If the application function was called because data was lost in the network and has to be retransmitted, it also sends an "ok". Note that this example actually shows a complete uIP application. It is not required for an application to deal with all types of events such as uip_connected() or uip_timedout().
This application is similar to the first application in that it listens to a port for incoming connections and responds to data sent to it with a single "ok". The big difference is that this application prints out a welcoming "Welcome!" message when the connection has been established.
This seemingly small change of operation makes a big difference in how the application is implemented. The reason for the increase in complexity is that if data should be lost in the network, the application must know what data to retransmit. If the "Welcome!" message was lost, the application must retransmit the welcome and if one of the "ok" messages is lost, the application must send a new "ok".
The application knows that as long as the "Welcome!" message has not been acknowledged by the remote host, it might have been dropped in the network. But once the remote host has sent an acknowledgment back, the application can be sure that the welcome has been received and knows that any lost data must be an "ok" message. Thus the application can be in either of two states: either in the WELCOME-SENT state where the "Welcome!" has been sent but not acknowledged, or in the WELCOME-ACKED state where the "Welcome!" has been acknowledged.
When a remote host connects to the application, the application sends the "Welcome!" message and sets its state to WELCOME-SENT. When the welcome message is acknowledged, the application moves to the WELCOME-ACKED state. If the application receives any new data from the remote host, it responds by sending an "ok" back.
If the application is requested to retransmit the last message, it checks which state it is in. If the application is in the WELCOME-SENT state, it sends a "Welcome!" message since it knows that the previous welcome message hasn't been acknowledged. If the application is in the WELCOME-ACKED state, it knows that the last message was an "ok" message and sends such a message.
The implementation of this application is shown below. The configuration settings for the application follow the implementation.
    struct example2_state {
      enum {WELCOME_SENT, WELCOME_ACKED} state;
    };

    void example2_init(void) {
      uip_listen(HTONS(2345));
    }

    void example2_app(void) {
      struct example2_state *s;

      s = (struct example2_state *)uip_conn->appstate;

      if(uip_connected()) {
        s->state = WELCOME_SENT;
        uip_send("Welcome!\n", 9);
        return;
      }

      if(uip_acked() && s->state == WELCOME_SENT) {
        s->state = WELCOME_ACKED;
      }

      if(uip_newdata()) {
        uip_send("ok\n", 3);
      }

      if(uip_rexmit()) {
        switch(s->state) {
        case WELCOME_SENT:
          uip_send("Welcome!\n", 9);
          break;
        case WELCOME_ACKED:
          uip_send("ok\n", 3);
          break;
        }
      }
    }
The configuration for the application:
    #define UIP_APPCALL       example2_app
    #define UIP_APPSTATE_SIZE sizeof(struct example2_state)
    void example3_init(void) {
      example1_init();
      example2_init();
    }

    void example3_app(void) {
      switch(uip_conn->lport) {
      case HTONS(1234):
        example1_app();
        break;
      case HTONS(2345):
        example2_app();
        break;
      }
    }
    void example4_init(void) {
      u16_t ipaddr[2];

      uip_ipaddr(ipaddr, 192,168,0,1);
      uip_connect(ipaddr, HTONS(80));
    }

    void example4_app(void) {
      if(uip_connected() || uip_rexmit()) {
        /* The request string is 42 bytes long, not counting the
           terminating NUL. */
        uip_send("GET /file HTTP/1.0\r\nServer:192.168.0.1\r\n\r\n", 42);
        return;
      }

      if(uip_newdata()) {
        device_enqueue(uip_appdata, uip_datalen());
        if(device_queue_full()) {
          uip_stop();
        }
      }

      if(uip_poll() && uip_stopped()) {
        if(!device_queue_full()) {
          uip_restart();
        }
      }
    }
When the connection has been established, an HTTP request is sent to the server. Since this is the only data that is sent, the application knows that if it needs to retransmit any data, it is that request that should be retransmitted. It is therefore possible to combine these two events as is done in the example.
When the application receives new data from the remote host, it sends this data to the device by using the function device_enqueue(). It is important to note that this example assumes that this function copies the data into its own buffers. The data in the uip_appdata buffer will be overwritten by the next incoming packet.
If the device's queue is full, the application stops the data from the remote host by calling the uIP function uip_stop(). The application can then be sure that it will not receive any new data until uip_restart() is called. The application polling event is used to check if the device's queue is no longer full and if so, the data flow is restarted with uip_restart().
    struct example5_state {
      char *dataptr;
      unsigned int dataleft;
    };

    void example5_init(void) {
      uip_listen(HTONS(80));
      uip_listen(HTONS(81));
    }

    void example5_app(void) {
      struct example5_state *s;

      /* appstate is cast to a pointer to the application state. */
      s = (struct example5_state *)uip_conn->appstate;

      if(uip_connected()) {
        switch(uip_conn->lport) {
        case HTONS(80):
          s->dataptr = data_port_80;
          s->dataleft = datalen_port_80;
          break;
        case HTONS(81):
          s->dataptr = data_port_81;
          s->dataleft = datalen_port_81;
          break;
        }
        uip_send(s->dataptr, s->dataleft);
        return;
      }

      if(uip_acked()) {
        if(s->dataleft < uip_mss()) {
          uip_close();
          return;
        }
        s->dataptr += uip_conn->len;
        s->dataleft -= uip_conn->len;
        uip_send(s->dataptr, s->dataleft);
      }
    }
The application state consists of a pointer to the data that should be sent and the size of the data that is left to send. When a remote host connects to the application, the local port number is used to determine which file to send. The first chunk of data is sent using uip_send(). uIP makes sure that no more than MSS bytes of data is actually sent, even though s->dataleft may be larger than the MSS.
The application is driven by incoming acknowledgments. When data has been acknowledged, new data can be sent. If there is no more data to send, the connection is closed using uip_close().
The uIP event handler function is shown below.
    void example6_app(void) {
      if(uip_aborted()) {
        aborted();
      }
      if(uip_timedout()) {
        timedout();
      }
      if(uip_closed()) {
        closed();
      }
      if(uip_connected()) {
        connected();
      }
      if(uip_acked()) {
        acked();
      }
      if(uip_newdata()) {
        newdata();
      }
      if(uip_rexmit() ||
         uip_newdata() ||
         uip_acked() ||
         uip_connected() ||
         uip_poll()) {
        senddata();
      }
    }
The function starts by dealing with any error conditions that might have happened, checking whether uip_aborted() or uip_timedout() is true. If so, the appropriate error function is called. Also, if the connection has been closed, the closed() function is called to deal with the event.
Next, the function checks if the connection has just been established by checking if uip_connected() is true. The connected() function is called and is supposed to do whatever needs to be done when the connection is established, such as initializing the application state for the connection. Since it may be the case that data should be sent out, the senddata() function is called to deal with the outgoing data.
The following very simple application serves as an example of how the application handler functions might look. This application simply waits for any data to arrive on the connection, and responds to the data by sending out the message "Hello world!". To illustrate how to develop an application state machine, this message is sent in two parts, first the "Hello" part and then the "world!" part.
    #define STATE_WAITING 0
    #define STATE_HELLO   1
    #define STATE_WORLD   2

    struct example6_state {
      u8_t state;
      char *textptr;
      int textlen;
    };

    static void aborted(void) {}
    static void timedout(void) {}
    static void closed(void) {}

    static void connected(void) {
      struct example6_state *s = (struct example6_state *)uip_conn->appstate;

      s->state = STATE_WAITING;
      s->textlen = 0;
    }

    static void newdata(void) {
      struct example6_state *s = (struct example6_state *)uip_conn->appstate;

      if(s->state == STATE_WAITING) {
        s->state = STATE_HELLO;
        s->textptr = "Hello ";
        s->textlen = 6;
      }
    }

    static void acked(void) {
      struct example6_state *s = (struct example6_state *)uip_conn->appstate;

      s->textlen -= uip_conn->len;
      s->textptr += uip_conn->len;
      if(s->textlen == 0) {
        switch(s->state) {
        case STATE_HELLO:
          s->state = STATE_WORLD;
          s->textptr = "world!\n";
          s->textlen = 7;
          break;
        case STATE_WORLD:
          uip_close();
          break;
        }
      }
    }

    static void senddata(void) {
      struct example6_state *s = (struct example6_state *)uip_conn->appstate;

      if(s->textlen > 0) {
        uip_send(s->textptr, s->textlen);
      }
    }
The application state consists of a "state" variable, a "textptr" pointer to a text message and the "textlen" length of the text message. The "state" variable can be either "STATE_WAITING", meaning that the application is waiting for data to arrive from the network, "STATE_HELLO", in which the application is sending the "Hello" part of the message, or "STATE_WORLD", in which the application is sending the "world!" message.
The application does not handle errors or connection closing events, and therefore the aborted(), timedout() and closed() functions are implemented as empty functions.
The connected() function will be called when a connection has been established, and in this case sets the "state" variable to be "STATE_WAITING" and the "textlen" variable to be zero, indicating that there is no message to be sent out.
When new data arrives from the network, the newdata() function will be called by the event handler function. The newdata() function will check if the connection is in the "STATE_WAITING" state, and if so switches to the "STATE_HELLO" state and registers a 6 byte long "Hello " message with the connection. This message will later be sent out by the senddata() function.
The acked() function is called whenever data that previously was sent has been acknowledged by the receiving host. This acked() function first reduces the amount of data that is left to send, by subtracting the length of the previously sent data (obtained from "uip_conn->len") from the "textlen" variable, and also adjusts the "textptr" pointer accordingly. It then checks if the "textlen" variable now is zero, which indicates that all data now has been successfully received, and if so changes application state. If the application was in the "STATE_HELLO" state, it switches state to "STATE_WORLD" and sets up a 7 byte "world!\n" message to be sent. If the application was in the "STATE_WORLD" state, it closes the connection.
Finally, the senddata() function takes care of actually sending the data that is to be sent. It is called by the event handler function when new data has been received, when data has been acknowledged, when a new connection has been established, when the connection is polled because of inactivity, or when a retransmission should be made. The purpose of the senddata() function is to optionally format the data that is to be sent, and to call the uip_send() function to actually send out the data. In this particular example, the function simply calls uip_send() with the appropriate arguments if data is to be sent, after checking if data should be sent out or not as indicated by the "textlen" variable.
It is important to note that the senddata() function never should affect the application state; this should only be done in the acked() and newdata() functions.
This section gives detailed information on the specific protocol implementations in uIP.
The current implementation only has a single buffer for holding packets to be reassembled, and therefore does not support simultaneous reassembly of more than one packet. Since fragmented packets are uncommon, this ought to be a reasonable decision. Extending the implementation to support multiple buffers would be straightforward, however.
The ICMP implementation in uIP is very simple as it is restricted to only implement ICMP echo messages. Replies to echo messages are constructed by simply swapping the source and destination IP addresses of incoming echo requests and rewriting the ICMP header with the Echo-Reply message type. The ICMP checksum is adjusted using standard techniques (see RFC1624).
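The checksum adjustment can be illustrated with the incremental update from RFC1624, HC' = ~(~HC + ~m + m'), where m is the old value of the changed 16-bit word and m' the new one. The helper below is a hypothetical sketch working on host-order values; it is not taken from the uIP sources.

```c
#include <stdint.h>

/* Hypothetical helper: update a 16-bit one's-complement checksum after a
   single 16-bit word of the covered data changes from m_old to m_new. */
uint16_t csum_update_sketch(uint16_t hc, uint16_t m_old, uint16_t m_new) {
  uint32_t sum = (uint16_t)~hc;

  sum += (uint16_t)~m_old;
  sum += m_new;
  sum = (sum & 0xffff) + (sum >> 16);   /* fold the carries back in */
  sum = (sum & 0xffff) + (sum >> 16);
  return (uint16_t)~sum;
}
```

For an echo request whose first header word is 0x0800 (type 8, code 0), turning it into an echo reply changes that word to 0x0000, so only this one word enters the update and the rest of the packet does not have to be summed again.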
Since only the ICMP echo message is implemented, there is no support for Path MTU discovery or ICMP redirect messages. Neither of these is strictly required for interoperability; they are performance enhancement mechanisms.
The sliding window algorithm uses a lot of 32-bit operations and because 32-bit arithmetic is fairly expensive on most 8-bit CPUs, uIP does not implement it. Also, uIP does not buffer sent packets and a sliding window implementation that does not buffer sent packets will have to be supported by a complex application layer. Instead, uIP allows only a single TCP segment per connection to be unacknowledged at any given time.
It is important to note that even though most TCP implementations use the sliding window algorithm, it is not required by the TCP specifications. Removing the sliding window mechanism does not affect interoperability in any way.
The RTT estimation in uIP is implemented using TCP's periodic timer. Each time the periodic timer fires, it increments a counter for each connection that has unacknowledged data in the network. When an acknowledgment is received, the current value of the counter is used as a sample of the RTT. The sample is used together with Van Jacobson's standard TCP RTT estimation function to calculate an estimate of the RTT. Karn's algorithm is used to ensure that retransmissions do not skew the estimates.
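The estimator can be sketched in the fixed-point form commonly attributed to Van Jacobson, in which the smoothed RTT is stored scaled by 8 and the mean deviation scaled by 4. The structure, field names and function below are hypothetical and only illustrate the technique; samples are assumed to be non-negative tick counts.

```c
/* Hypothetical state: sa holds 8 x the smoothed RTT and sv holds 4 x the
   mean deviation, both measured in periodic-timer ticks. */
struct rtt_sketch {
  int sa;
  int sv;
  int rto;
};

/* Feed one RTT sample (in ticks). Following Karn's algorithm, a sample
   taken from a segment that was retransmitted is ambiguous and is ignored. */
void rtt_update_sketch(struct rtt_sketch *r, int sample, int retransmitted) {
  int m;

  if(retransmitted) {
    return;
  }
  m = sample - (r->sa >> 3);      /* error against the smoothed RTT */
  r->sa += m;                     /* srtt += error / 8, in scaled form */
  if(m < 0) {
    m = -m;
  }
  m -= (r->sv >> 2);              /* error against the mean deviation */
  r->sv += m;                     /* mdev += (|error| - mdev) / 4 */
  r->rto = (r->sa >> 3) + r->sv;  /* srtt + 4 * mdev */
}
```

Because the state is kept in scaled integers, each update needs only shifts, additions and subtractions, which suits small CPUs without hardware division.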
As uIP does not keep track of packet contents after they have been sent by the device driver, uIP requires that the application takes an active part in performing the retransmission. When uIP decides that a segment should be retransmitted, it calls the application with a flag set indicating that a retransmission is required. The application checks the retransmission flag and produces the same data that was previously sent. From the application's standpoint, performing a retransmission is not different from how the data originally was sent. Therefore the application can be written in such a way that the same code is used both for sending data and retransmitting data. Also, it is important to note that even though the actual retransmission operation is carried out by the application, it is the responsibility of the stack to know when the retransmission should be made. Thus the complexity of the application does not necessarily increase because it takes an active part in doing retransmissions.
In uIP, the application cannot send more data than the receiving host can buffer: an application cannot send more than the number of bytes it is allowed to send by the receiving host. If the remote host cannot accept any data at all, the stack initiates the zero window probing mechanism.
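The limit can be pictured as a simple clamp on the length of the single segment that is sent. The helper below is a hypothetical sketch, not uIP code; it also applies the MSS limit mentioned earlier in connection with uip_send().

```c
#include <stdint.h>

/* Hypothetical helper: the number of bytes that may go into the single
   in-flight segment, limited by the MSS and by the window advertised by
   the receiving host. A result of 0 while data is pending corresponds to
   the zero-window case. */
uint16_t send_len_sketch(uint16_t pending, uint16_t mss, uint16_t window) {
  uint16_t n = pending;

  if(n > mss) {
    n = mss;
  }
  if(n > window) {
    n = window;
  }
  return n;
}
```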
Since uIP only handles one in-flight TCP segment per connection, the number of simultaneous segments cannot be further limited, and thus the congestion control mechanisms are not needed.
In many TCP implementations, including the BSD implementation, the urgent data feature increases the complexity of the implementation because it requires an asynchronous notification mechanism in an otherwise synchronous API. As uIP already uses an asynchronous event-based API, the implementation of the urgent data feature does not lead to increased complexity.
A small embedded device does not have the necessary processing power to have multiple protection domains and the power to run a multitasking operating system. Therefore there is no need to copy data between the TCP/IP stack and the application program. With an event based API there is no context switch between the TCP/IP stack and the applications.
In such limited systems, the TCP/IP processing overhead is dominated by the copying of packet data from the network device to host memory, and checksum calculation. Apart from the checksum calculation and copying, the TCP processing done for an incoming packet involves only updating a few counters and flags before handing the data over to the application. Thus an estimate of the CPU overhead of our TCP/IP implementations can be obtained by calculating the amount of CPU cycles needed for the checksum calculation and copying of a maximum sized packet.
A TCP sender such as uIP that only handles a single outstanding TCP segment will interact poorly with the delayed acknowledgment algorithm. Because the receiver only receives a single segment at a time, it will wait as much as 500 ms before an acknowledgment is sent. This means that the maximum possible throughput is severely limited by the 500 ms idle time.
Thus the maximum throughput when sending data from uIP is $p = s / (t + t_d)$, where $s$ is the segment size, $t$ is the round-trip time and $t_d$ is the delayed acknowledgment timeout, which is typically between 200 and 500 ms. With a segment size of 1000 bytes, a round-trip time of 40 ms and a delayed acknowledgment timeout of 200 ms, the maximum throughput will be 4166 bytes per second. With the delayed acknowledgment algorithm disabled at the receiver, the maximum throughput would be 25000 bytes per second.
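Substituting the quoted figures into the throughput equation, and taking $t$ to be the round-trip time (an interpretation inferred from the example values), gives:

$$p = \frac{1000}{0.040 + 0.200} \approx 4166\ \text{bytes/s}, \qquad p\big|_{t_d = 0} = \frac{1000}{0.040} = 25000\ \text{bytes/s}$$

The delayed acknowledgment timeout therefore costs a factor of six in this example.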
It should be noted, however, that since small systems running uIP are not very likely to have large amounts of data to send, the delayed acknowledgment throughput degradation of uIP need not be very severe. Small amounts of data sent by such a system will not span more than a single TCP segment, and would therefore not be affected by the throughput degradation anyway.
The maximum throughput when uIP acts as a receiver is not affected by the delayed acknowledgment throughput degradation.