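SCTP (Stream Control Transmission Protocol) is a transport protocol, newer than TCP and UDP, that is also used for LTE signalling: on the S1-MME interface between the eNB and the MME (core network), and between the MME and the HSS (Diameter over SCTP).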
1 SCTP Protocol
The SCTP packet follows the MAC and IP headers.
The basic SCTP header consists of the Source and Destination Ports (16 bits each), the Verification Tag (32 bits) and the Checksum (32 bits).
The Verification Tag is used by the receiver to validate the sender of the packet. Each endpoint publishes its tag to the remote end during the 4-way handshake that initially sets up the SCTP association.
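As a rough illustration of the header layout described above, the following C sketch reads the four fields from the first 12 bytes of an SCTP packet. The struct and function names are made up for this example and are not part of any SCTP library. The fields are assumed to arrive in network byte order, and the checksum is returned as the raw 32-bit field without being verified.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the 12-byte SCTP common header (host byte order). */
struct sctp_common_header {
    uint16_t src_port;          /* 16-bit source port */
    uint16_t dst_port;          /* 16-bit destination port */
    uint32_t verification_tag;  /* 32-bit tag learned during the handshake */
    uint32_t checksum;          /* 32-bit checksum field, not validated here */
};

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | (uint16_t)p[1]);
}

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short to hold the header. */
int parse_sctp_common_header(const uint8_t *buf, size_t len,
                             struct sctp_common_header *out)
{
    if (buf == NULL || out == NULL || len < 12)
        return -1;
    out->src_port         = read_be16(buf);
    out->dst_port         = read_be16(buf + 2);
    out->verification_tag = read_be32(buf + 4);
    out->checksum         = read_be32(buf + 8);
    return 0;
}
```

The buffer passed in is expected to start at the first byte after the IP header.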
1.1 4-Way handshake Msgs
1.1.1 INIT - Contains Initiate Tag, receiver window, in/out bound streams, initial TSN
1.1.2 INIT-ACK - Contains the same params as the INIT msg, plus the State Cookie
1.1.3 COOKIE-ECHO - Contains Cookie same as received in INIT-ACK
1.1.4 COOKIE-ACK - Contains nothing, used to acknowledge receipt of COOKIE-ECHO
Completion of the above 4 SCTP msgs brings the SCTP association to the established state.
State Cookie - holds all the state and param info that the sender of the INIT-ACK needs to create the association, along with a Message Authentication Code (MAC).
1.2 SCTP Messages
1.2.1 INIT
1.2.2 INIT-ACK
1.2.3 SACK
1.2.4 HEARTBEAT
1.2.5 HEARTBEAT-ACK
1.2.6 ABORT
1.2.7 SHUTDOWN
1.2.8 SHUTDOWN-ACK
1.2.9 ERROR
1.2.10 COOKIE-ECHO
1.2.11 COOKIE-ACK
1.2.12 SHUTDOWN-COMPLETE
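When reading packet traces it helps to map the chunk type byte to the message names listed above. The sketch below uses the chunk type values assigned in RFC 4960. It also includes DATA (0), ECNE (12) and CWR (13), which are not in the list above. The table and function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One row of the illustrative lookup table. */
struct chunk_type_entry {
    uint8_t type;
    const char *name;
};

/* Chunk type values as assigned in RFC 4960. */
static const struct chunk_type_entry chunk_types[] = {
    {0, "DATA"},          {1, "INIT"},           {2, "INIT-ACK"},
    {3, "SACK"},          {4, "HEARTBEAT"},      {5, "HEARTBEAT-ACK"},
    {6, "ABORT"},         {7, "SHUTDOWN"},       {8, "SHUTDOWN-ACK"},
    {9, "ERROR"},         {10, "COOKIE-ECHO"},   {11, "COOKIE-ACK"},
    {12, "ECNE"},         {13, "CWR"},           {14, "SHUTDOWN-COMPLETE"},
};

#define CHUNK_TYPE_COUNT (sizeof chunk_types / sizeof chunk_types[0])

/* Returns the message name for a chunk type, or "UNKNOWN". */
const char *sctp_chunk_type_name(uint8_t type)
{
    size_t i;
    for (i = 0; i < CHUNK_TYPE_COUNT; i++) {
        if (chunk_types[i].type == type)
            return chunk_types[i].name;
    }
    return "UNKNOWN";
}

/* Returns the chunk type for a message name, or -1 if the name is not known. */
int sctp_chunk_type_from_name(const char *name)
{
    size_t i;
    if (name == NULL)
        return -1;
    for (i = 0; i < CHUNK_TYPE_COUNT; i++) {
        if (strcmp(chunk_types[i].name, name) == 0)
            return chunk_types[i].type;
    }
    return -1;
}
```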
1.3 SCTP API usage
1.3.1 sctp_bindx() API allows the user to bind a specific subset of addresses
or, if the SCTP extension described in [RFC5061] is supported, add or
delete specific addresses.
function prototype -
int sctp_bindx(int sd, struct sockaddr *addrs, int addrcnt, int flags);
1.3.2 After an association is established on a one-to-many style socket,
the application may wish to branch off the association into a
separate socket/file descriptor.
This is particularly useful when the application wants new clients to stay
under centralized control on the original one-to-many style socket but
branch off those associations carrying high volume data traffic into
their own separate socket descriptors.
Use the sctp_peeloff() API to branch off an association
into a separate socket (Note the semantics are somewhat changed from
the traditional one-to-one style accept() call). Note that the new
socket is a one-to-one style socket. Thus it will be confined to
operations allowed for a one-to-one style socket.
function prototype
int sctp_peeloff(int sd, sctp_assoc_t assoc_id);
1.3.3 sctp_sendmsg() API provides a way to send an SCTP msg to the remote end.
function prototype -
ssize_t sctp_sendmsg(int sd, const void *msg, size_t len,
const struct sockaddr *to,
socklen_t tolen,
uint32_t ppid,
uint32_t flags,
uint16_t stream_no,
uint32_t pr_value,
uint32_t context);
1.3.4 sctp_recvmsg() API provides a way to receive an SCTP msg from the remote end.
function prototype -
ssize_t sctp_recvmsg(int sd, void *msg, size_t len,
struct sockaddr *from,
socklen_t *fromlen,
struct sctp_sndrcvinfo *sinfo,
int *msg_flags);
1.3.5 sctp_connectx() API helps the user associate with an endpoint that is
multi-homed. Much like sctp_bindx(), this call allows a caller to
specify multiple addresses at which a peer can be reached. The way
the SCTP stack uses the list of addresses passed to it to set up the association
is implementation dependent.
The addresses passed to the API do not necessarily equal the set of addresses
the peer uses for the resulting association. The user can find out the set of
peer addresses by calling sctp_getpaddrs() after the association
has been set up.
function prototype -
int sctp_connectx(int sd, struct sockaddr *addrs,
int addrcnt, sctp_assoc_t *id);
1.3.6 sctp_send() API lets an application send data
without the use of the CMSG header structures.
function prototype -
ssize_t sctp_send(int sd, const void *msg, size_t len,
const struct sctp_sndrcvinfo *sinfo, int flags);
1.3.7 sctp_sendx() API lets an application send data
without the use of the CMSG header structures and
also takes a list of remote addresses.
The input list of addresses is used for implicit association setup.
function prototype -
ssize_t sctp_sendx(int sd, const void *msg, size_t len,
struct sockaddr *addrs, int addrcnt,
struct sctp_sndrcvinfo *sinfo, int flags);
2 Differences between SCTP / TCP / UDP
Each row below lists the values in the order SCTP/TCP/UDP.
Connection Oriented=yes/yes/no
Full Duplex=yes/yes/yes
Reliable Data Transfer=yes/yes/no
Partial-Reliable Data Transfer=optional/no/no
Flow Control=yes/yes/no
TCP-Friendly Congestion Control=yes/yes/no
ECN Capable=yes/yes/no
Ordered Data Delivery=yes/yes/no
UnOrdered Data Delivery=yes/no/yes
Selective Acks=yes/optional/no
Path MTU Discovery=yes/yes/no
Application PDU Fragmentation=yes/yes/no
Application PDU Bundling=yes/yes/no
Preserve App PDU boundaries=yes/no/yes
Multi Streaming=yes/no/no
Multi Homing=yes/no/no
Protection to SYN Flooding Attacks=yes/no/NA
Half Closed Connections=no/yes/NA
Reachability Checks=yes/yes/no
Pseudo-Header for Checksum=no/yes/yes
Time wait State=for vtags/for 4-tuple/NA
Handshake=4-way/ 3-way/no
Authentication=optional/optional/no
CRC based Checksum=yes/no/no
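The "CRC based Checksum" row refers to CRC32c (the Castagnoli polynomial), which RFC 4960 specifies for the 32-bit checksum field of the SCTP common header. The following is a minimal bit-by-bit sketch of that CRC. The function name is made up for this example, and a production stack would normally use a table-driven or hardware-assisted version. It only computes the CRC over a byte buffer. Zeroing the checksum field before the computation and placing the result into the header are left out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32c: reflected Castagnoli polynomial 0x82F63B78,
 * initial value 0xFFFFFFFF, final XOR with 0xFFFFFFFF.
 */
uint32_t crc32c_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0x82F63B78u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

A common sanity check is the ASCII string "123456789", for which CRC32c gives 0xE3069283.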
3 Kernel Tunables (Sysctls)
These variables are accessed by the /proc/sys/net/sctp/* files or with the sysctl(2) interface. In addition, most common IP sysctls also apply to SCTP.
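For programs that want to read one of these tunables directly, a small helper that reads a single integer from a file is usually enough. The sketch below assumes each file under /proc/sys/net/sctp/ holds one integer value, which is the case for the tunables listed in this section. The function name is hypothetical.

```c
#include <stdio.h>

/*
 * Reads one integer from the file at `path` into `*value`.
 * Returns 0 on success, -1 if the file cannot be opened or parsed
 * (for example when the sctp module is not loaded).
 */
int read_int_tunable(const char *path, long *value)
{
    FILE *fp;
    int ok;

    if (path == NULL || value == NULL)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%ld", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, read_int_tunable("/proc/sys/net/sctp/hb_interval", &v) would be expected to store the heartbeat interval in msecs in v on a host where SCTP is available.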
3.1 addip_enable
Enable SCTP ADDIP(Dynamic Address Reconfiguration) Support. This is off by default.
3.2 association_max_retrans
Maximum number of consecutive retransmissions to a peer before an endpoint considers that the peer is unreachable and closes the association. The default value is 10.
3.3 cookie_preserve_enable
Handle COOKIE PRESERVATIVE parameter in the INIT chunk. This is on by default.
3.4 hb_interval
This is the interval when a HEARTBEAT chunk is sent to a destination transport address to monitor the reachability of an idle destination transport address. The default is 30 seconds and is maintained in msecs.
3.5 max_burst
Maximum number of new data packets that can be sent in a burst. The default value is 4.
3.6 max_init_retransmits
Maximum number of times an INIT chunk or a COOKIE ECHO chunk is retransmitted before an endpoint aborts the initialization process and closes the association. The default value is 8.
3.7 path_max_retrans
Maximum number of consecutive retransmissions over a destination transport address of a peer endpoint before it is marked as inactive. The default value is 5.
3.8 prsctp_enable
Enable PR-SCTP. This is on by default.
3.9 rcvbuf_policy
This controls the socket receive buffer accounting policy. The default value is 0 and indicates that all the associations belonging to a socket share the same receive buffer space. When set to 1, each association will have its own receive buffer space.
3.10 rto_alpha_exp_divisor
This is the RTO.Alpha value when expressed in right shifts and is used in RTO calculations. The default value is 3.
3.11 rto_beta_exp_divisor
This is the RTO.Beta value when expressed in right shifts and is used in RTO calculations. The default value is 2.
3.12 rto_initial
This is the initial value of RTO(retransmission timeout) that is used in RTO calculations. The default value is 3 seconds and is maintained in msecs.
3.13 rto_max
This is the maximum value of RTO(retransmission timeout) that is used in RTO calculations. The default value is 60 seconds and is maintained in msecs.
3.14 rto_min
This is the minimum value of RTO(retransmission timeout) that is used in RTO calculations. The default value is 1 second and is maintained in msecs.
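To show how the rto_* tunables fit together, here is a sketch of the RTO update rules from RFC 4960 (section 6.3.1), with RTO.Alpha and RTO.Beta applied as right shifts. All struct and function names are invented for this example. The integer arithmetic is an assumption about how a stack might apply the shifts, not a copy of the kernel code.

```c
#include <stdint.h>

/* Tunables, in msecs and shift counts (defaults in this article: 3000, 1000, 60000, 3, 2). */
struct rto_config {
    uint32_t rto_initial_ms;
    uint32_t rto_min_ms;
    uint32_t rto_max_ms;
    unsigned alpha_shift;   /* RTO.Alpha = 1 / 2^alpha_shift */
    unsigned beta_shift;    /* RTO.Beta  = 1 / 2^beta_shift  */
};

/* Per-path estimator state. */
struct rto_state {
    uint32_t srtt_ms;
    uint32_t rttvar_ms;
    uint32_t rto_ms;
    int has_sample;
};

/* Before any RTT measurement, RTO is simply rto_initial. */
void rto_init(struct rto_state *s, const struct rto_config *c)
{
    s->srtt_ms = 0;
    s->rttvar_ms = 0;
    s->rto_ms = c->rto_initial_ms;
    s->has_sample = 0;
}

/* Feed one RTT measurement and recompute RTO, clamped to [rto_min, rto_max]. */
void rto_update(struct rto_state *s, const struct rto_config *c, uint32_t rtt_ms)
{
    if (!s->has_sample) {
        /* First measurement: SRTT = R, RTTVAR = R / 2. */
        s->srtt_ms = rtt_ms;
        s->rttvar_ms = rtt_ms / 2;
        s->has_sample = 1;
    } else {
        uint32_t diff = s->srtt_ms > rtt_ms ? s->srtt_ms - rtt_ms
                                            : rtt_ms - s->srtt_ms;
        /* RTTVAR = (1 - Beta) * RTTVAR + Beta * |SRTT - R'| */
        s->rttvar_ms = s->rttvar_ms - (s->rttvar_ms >> c->beta_shift)
                     + (diff >> c->beta_shift);
        /* SRTT = (1 - Alpha) * SRTT + Alpha * R' */
        s->srtt_ms = s->srtt_ms - (s->srtt_ms >> c->alpha_shift)
                   + (rtt_ms >> c->alpha_shift);
    }
    /* RTO = SRTT + 4 * RTTVAR */
    s->rto_ms = s->srtt_ms + 4 * s->rttvar_ms;
    if (s->rto_ms < c->rto_min_ms)
        s->rto_ms = c->rto_min_ms;
    if (s->rto_ms > c->rto_max_ms)
        s->rto_ms = c->rto_max_ms;
}
```

With the default values, a first sample of 100 msecs gives a raw RTO of 300 msecs, which is raised to the rto_min of 1000 msecs.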
3.15 sack_timeout
Delayed SACK timeout. The default value is 200msecs.
3.16 sndbuf_policy
This controls the socket sendbuffer accounting policy. The default value is 0 and indicates that all the associations belonging to a socket share the same send buffer space. When set to 1, each association will have its own send buffer space.
3.17 valid_cookie_life
This is the maximum lifespan of the Cookie sent in an INIT ACK chunk. The default value is 60 secs and is maintained in msecs.
4 SCTP system level statistics
These stats variables can be accessed by the /proc/net/sctp/* files.
4.1 assocs ->
Displays the following information about the active associations. assoc ptr, sock ptr, socket style, sock state, association state, hash bucket, association id, bytes in transmit queue, bytes in receive queue, user id, inode, local port, remote port, local addresses and remote addresses.
4.2 eps ->
Displays the following information about the active endpoints. endpoint ptr, sock ptr, socket style, sock state, hash bucket, local port, user id, inode and local addresses.
4.3 snmp ->
Displays the following statistics related to SCTP states, packets and chunks.
4.3.1 SctpCurrEstab
The number of associations for which the current state is either ESTABLISHED, SHUTDOWN-RECEIVED or SHUTDOWN-PENDING.
4.3.2 SctpActiveEstabs
The number of times that associations have made a direct transition to the ESTABLISHED state from the COOKIE-ECHOED state. The upper layer initiated the association attempt.
4.3.3 SctpPassiveEstabs
The number of times that associations have made a direct transition to the ESTABLISHED state from the CLOSED state. The remote endpoint initiated the association attempt.
4.3.4 SctpAborteds
The number of times that associations have made a direct transition to the CLOSED state from any state using the primitive 'ABORT'. Ungraceful termination of the association.
4.3.5 SctpShutdowns
The number of times that associations have made a direct transition to the CLOSED state from either the SHUTDOWN-SENT state or the SHUTDOWN-ACK-SENT state. Graceful termination of the association.
4.3.6 SctpOutOfBlues
The number of out of the blue packets received by the host. An out of the blue packet is an SCTP packet correctly formed, including the proper checksum, but for which the receiver was unable to identify an appropriate association.
4.3.7 SctpChecksumErrors
The number of SCTP packets received with an invalid checksum.
4.3.8 SctpOutCtrlChunks
The number of SCTP control chunks sent (retransmissions are not included). Control chunks are those chunks different from DATA.
4.3.9 SctpOutOrderChunks
The number of SCTP ordered data chunks sent (retransmissions are not included).
4.3.10 SctpOutUnorderChunks
The number of SCTP unordered chunks(data chunks in which the U bit is set to 1) sent (retransmissions are not included).
4.3.11 SctpInCtrlChunks
The number of SCTP control chunks received (no duplicate chunks included).
4.3.12 SctpInOrderChunks
The number of SCTP ordered data chunks received (no duplicate chunks included).
4.3.13 SctpInUnorderChunks
The number of SCTP unordered chunks(data chunks in which the U bit is set to 1) received (no duplicate chunks included).
4.3.14 SctpFragUsrMsgs
The number of user messages that have to be fragmented because of the MTU.
4.3.15 SctpReasmUsrMsgs
The number of user messages reassembled, after conversion into DATA chunks.
4.3.16 SctpOutSCTPPacks
The number of SCTP packets sent. Retransmitted DATA chunks are included.
4.3.17 SctpInSCTPPacks
The number of SCTP packets received. Duplicates are included.
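A monitoring tool can pick a single counter out of the snmp file with a few lines of C. The sketch below assumes each line of /proc/net/sctp/snmp has the form "name, whitespace, value", which is how the file appears on Linux systems. Both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Splits one "Name   value" line into its parts.
 * `name` must point to a buffer of at least 64 bytes.
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_snmp_line(const char *line, char *name, unsigned long long *value)
{
    if (line == NULL || name == NULL || value == NULL)
        return -1;
    return sscanf(line, "%63s %llu", name, value) == 2 ? 0 : -1;
}

/*
 * Looks up the counter called `wanted` in the file at `path`.
 * Returns 0 and stores the value on success, -1 if the file cannot
 * be opened or the counter is not present.
 */
int read_sctp_counter(const char *path, const char *wanted,
                      unsigned long long *value)
{
    char line[256];
    char name[64];
    unsigned long long v;
    int found = -1;
    FILE *fp;

    if (path == NULL || wanted == NULL || value == NULL)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_snmp_line(line, name, &v) == 0 && strcmp(name, wanted) == 0) {
            *value = v;
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

For example, read_sctp_counter("/proc/net/sctp/snmp", "SctpCurrEstab", &n) would be expected to return the number of currently established associations.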
5 Socket Options
To set or get an SCTP socket option, call getsockopt(2) to read or setsockopt(2) to write the option with the option level argument set to SOL_SCTP.
SCTP_RTOINFO
This option is used to get or set the protocol parameters used to initialize and bound the retransmission timeout (RTO). The structure sctp_rtoinfo defined in /usr/include/netinet/sctp.h is used to access and modify these parameters.
SCTP_ASSOCINFO
This option is used to both examine and set various association and endpoint parameters. The structure sctp_assocparams defined in /usr/include/netinet/sctp.h is used to access and modify these parameters.
SCTP_INITMSG
This option is used to get or set the protocol parameters for the default association initialization. The structure sctp_initmsg defined in /usr/include/netinet/sctp.h is used to access and modify these parameters.
Setting initialization parameters is effective only on an unconnected socket (for one-to-many style sockets only future associations are affected by the change). With one-to-one style sockets, this option is inherited by sockets derived from a listener socket.
SCTP_NODELAY
Turn on/off any Nagle-like algorithm. This means that packets are generally sent as soon as possible and no unnecessary delays are introduced, at the cost of more packets in the network. Expects an integer boolean flag.
SCTP_AUTOCLOSE
This socket option is applicable to the one-to-many style socket only. When set, it will cause associations that are idle for more than the specified number of seconds to automatically close. An idle association is defined as an association that has NOT sent or received user data. The special value of 0 indicates that no automatic close of any associations should be performed. The option expects an integer defining the number of seconds of idle time before an association is closed.
SCTP_SET_PEER_PRIMARY_ADDR
Requests that the peer mark the enclosed address as the association primary. The enclosed address must be one of the association's locally bound addresses. The structure sctp_setpeerprim defined in /usr/include/netinet/sctp.h is used to make a set peer primary request.
SCTP_PRIMARY_ADDR
Requests that the local SCTP stack use the enclosed peer address as the association primary. The enclosed address must be one of the association peer's addresses. The structure sctp_prim defined in /usr/include/netinet/sctp.h is used to make a get/set primary request.
SCTP_DISABLE_FRAGMENTS
This option is an on/off flag and is passed an integer where a non-zero is on and a zero is off. If enabled, no SCTP message fragmentation will be performed. Instead, if a message being sent exceeds the current PMTU size, the message will NOT be sent and an error will be indicated to the user.
SCTP_PEER_ADDR_PARAMS
Using this option, applications can enable or disable heartbeats for any peer address of an association, modify an address's heartbeat interval, force a heartbeat to be sent immediately, and adjust the address's maximum number of retransmissions sent before an address is considered unreachable. The structure sctp_paddrparams defined in /usr/include/netinet/sctp.h is used to access and modify an address's parameters.
SCTP_DEFAULT_SEND_PARAM
Applications that wish to use the sendto() system call may wish to specify a default set of parameters that would normally be supplied through the inclusion of ancillary data. This socket option allows such an application to set the default sctp_sndrcvinfo structure. The application that wishes to use this socket option simply passes in to this call the sctp_sndrcvinfo structure defined in /usr/include/netinet/sctp.h. The input parameters accepted by this call include sinfo_stream, sinfo_flags, sinfo_ppid, sinfo_context, sinfo_timetolive. The user must set the sinfo_assoc_id field to identify the association to affect if the caller is using the one-to-many style.
SCTP_EVENTS
This socket option is used to specify various notifications and ancillary data the user wishes to receive. The structure sctp_event_subscribe defined in /usr/include/netinet/sctp.h is used to access or modify the events of interest to the user.
SCTP_I_WANT_MAPPED_V4_ADDR
This socket option is a boolean flag which turns on or off mapped V4 addresses. If this option is turned on and the socket is type PF_INET6, then IPv4 addresses will be mapped to V6 representation. If this option is turned off, then no mapping will be done of V4 addresses and a user will receive both PF_INET6 and PF_INET type addresses on the socket.
By default this option is turned on and expects an integer to be passed where non-zero turns on the option and zero turns off the option.
SCTP_MAXSEG
This socket option specifies the maximum size to put in any outgoing SCTP DATA chunk. If a message is larger than this size it will be fragmented by SCTP into the specified size. Note that the underlying SCTP implementation may fragment into smaller sized chunks when the PMTU of the underlying association is smaller than the value set by the user. The option expects an integer.
The default value for this option is 0, which indicates the user is NOT limiting fragmentation and only the PMTU will affect SCTP's choice of DATA chunk size.
SCTP_STATUS
Applications can retrieve current status information about an association, including association state, peer receiver window size, number of unacked data chunks, and number of data chunks pending receipt. This information is read-only. The structure sctp_status defined in /usr/include/netinet/sctp.h is used to access this information.
SCTP_GET_PEER_ADDR_INFO
Applications can retrieve information about a specific peer address of an association, including its reachability state, congestion window, and retransmission timer values. This information is read-only. The structure sctp_paddr_info defined in /usr/include/netinet/sctp.h is used to access this information.
6 Notes
Check if your kernel supports SCTP:
run command : /usr/bin/checksctp
check output : SCTP supported
run command : lsmod
check the output for the sctp kernel module (sctp.ko)
Test apps that come with the lksctp open source release:
sctp_darn, sctp_test
SCTP supports 2 socket flavors -
1. one to one connection - TCP type [ socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP); ]
2. one to many connection - UDP type [ socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP); ]
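The same socket() calls can double as a quick programmatic check that the running kernel supports SCTP. This is similar in spirit to the checksctp command, although the sketch below is not taken from it. The function name is made up for this example. On some systems the call may cause the sctp module to be loaded on demand.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Tries to create an SCTP socket of the given style:
 *   SOCK_STREAM    -> one-to-one style
 *   SOCK_SEQPACKET -> one-to-many style
 * Returns 1 if the socket could be created, 0 otherwise
 * (for example when the kernel has no SCTP support).
 */
int sctp_socket_available(int type)
{
    int sd = socket(AF_INET, type, IPPROTO_SCTP);

    if (sd < 0)
        return 0;
    close(sd);
    return 1;
}
```

Calling sctp_socket_available(SOCK_STREAM) and sctp_socket_available(SOCK_SEQPACKET) covers both socket flavors listed above.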
Most of the common SCTP structures can be checked in the /usr/include/netinet/sctp.h header file.
How to capture SCTP packets through tcpdump
iptables -A INPUT -p sctp --chunk-types any INIT,INIT_ACK
iptables -p tcp --tcp-flags SYN,FIN,ACK SYN
tcpdump -ilo -n -v 'tcp[tcpflags] & (tcp-rst) != 0' - to check TCP reset message
Hope this article helps you get a jump start on SCTP.
Comments
Ordered delivery in SCTP is enforced at the stream level, so if a stream is blocked because in-sequence pkts have not arrived, only that stream is affected. The remaining streams work as if nothing happened [no head-of-line blocking across streams]. Congestion control itself is applied per destination address, not per stream.
SCTP allows data to be sent reliably but unordered. Also, the unordered option is set at the msg level.
Difference in close: TCP allows the “half-closed” state, where one side stays open while the remote side closes. SCTP does not support this; both sides must close when shutdown is called.