tcp



SYNOPSIS

       #include <sys/socket.h>
       #include <netinet/in.h>
       #include <netinet/tcp.h>
       tcp_socket = socket(PF_INET, SOCK_STREAM, 0);


DESCRIPTION

       This  is  an implementation of the TCP protocol defined in
       RFC793, RFC1122 and RFC2001  with  the  NewReno  and  SACK
       extensions.  It provides a reliable, stream oriented, full
       duplex connection between two sockets on top of ip(7), for
       both  v4  and  v6  versions.  TCP guarantees that the data
       arrives in order and retransmits lost packets.  It  gener­
       ates  and  checks a per packet checksum to catch transmis­
       sion errors.  TCP does not preserve record boundaries.
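
       Because TCP does not preserve record boundaries, a single
       read(2) may return fewer bytes than the sender wrote in one
       call.  The following is a minimal sketch of a receive loop
       that collects a fixed number of bytes; the helper name
       read_full is hypothetical and not part of any library.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until len bytes have arrived,
 * the peer closes the stream, or an error occurs.
 * Returns the number of bytes stored in buf, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)
            break;                  /* end of stream: return a short count */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

       A caller that expects a 5 byte record would call
       read_full(fd, buf, 5) and treat a shorter result as a closed
       connection.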

       A fresh TCP socket has no remote or local address  and  is
       not fully specified.  To create an outgoing TCP connection
       use connect(2) to establish a connection  to  another  TCP
       socket.   To  receive new incoming connections bind(2) the
       socket first to a local address and  port  and  then  call
       listen(2)  to  put the socket into listening state.  After
       that a new socket for  each  incoming  connection  can  be
       accepted  using  accept(2).  A socket which has had accept
       or connect successfully called on it  is  fully  specified
       and may transmit data.  Data cannot be transmitted on lis­
       tening or not yet connected sockets.
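
       The sequence described above can be sketched in a single
       process over the loopback interface.  This is an
       illustration only: the function name loopback_roundtrip is
       hypothetical, the port is chosen by the kernel, and error
       handling is reduced to a single failure code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind(2), listen(2), connect(2) and accept(2)
 * on 127.0.0.1, then pass four bytes from client to server.
 * Returns 0 on success, -1 on any failure. */
int loopback_roundtrip(void)
{
    int lfd = -1, cfd = -1, afd = -1, rc = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    char buf[4];
    size_t got = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the kernel pick a port */

    /* Passive side: bind to a local address, then listen. */
    lfd = socket(PF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(lfd, 1) < 0) goto out;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) goto out;

    /* Active side: connect to the listening socket. */
    cfd = socket(PF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto out;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;

    /* accept(2) returns a new, fully specified socket. */
    afd = accept(lfd, NULL, NULL);
    if (afd < 0) goto out;

    if (write(cfd, "ping", 4) != 4) goto out;
    while (got < sizeof(buf)) {             /* the data may arrive in pieces */
        ssize_t n = read(afd, buf + got, sizeof(buf) - got);
        if (n <= 0) goto out;
        got += (size_t)n;
    }
    if (memcmp(buf, "ping", 4) == 0)
        rc = 0;
out:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```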

       Linux supports RFC1323 TCP  high  performance  extensions.
       These  include Protection Against Wrapped Sequence Numbers
       (PAWS), Window Scaling  and  Timestamps.   Window  scaling
       allows  the  use  of large (> 64K) TCP windows in order to
       support links with high latency or bandwidth.  To make use
       of  them,  the  send  and  receive  buffer  sizes  must be
       increased.   They   can   be   set   globally   with   the
       net.ipv4.tcp_wmem  and net.ipv4.tcp_rmem sysctl variables,
       or on  individual  sockets  by  using  the  SO_SNDBUF  and
       SO_RCVBUF socket options with the setsockopt(2) call.

       The  maximum  sizes  for  socket  buffers declared via the
       SO_SNDBUF and SO_RCVBUF  mechanisms  are  limited  by  the
       global  net.core.rmem_max  and  net.core.wmem_max sysctls.
       Note that TCP actually allocates twice the size of the
       buffer requested in the setsockopt(2) call, and so a
       subsequent getsockopt(2) call will not return the same size
       of buffer as requested in the setsockopt(2) call.  TCP
       uses the extra space for administrative purposes and
       internal kernel structures, and the sysctl variables reflect
       the larger sizes compared to the actual TCP windows.  On
       individual connections, the socket buffer size must be set
       prior to the listen(2) or connect(2) calls in order to have
       it take effect.  See socket(7) for more information.
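
       The difference between the requested and the reported
       buffer size can be observed with a short sketch such as the
       one below.  The helper name effective_rcvbuf is
       hypothetical; the value it returns depends on the kernel
       and on the net.core.rmem_max limit, so no particular result
       is assumed here.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request a receive buffer size on a fresh TCP
 * socket (before any listen(2) or connect(2) call) and return the
 * size that getsockopt(2) reports afterwards, or -1 on error. */
int effective_rcvbuf(int requested)
{
    int fd = socket(PF_INET, SOCK_STREAM, 0);
    int actual = -1;
    socklen_t len = sizeof(actual);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) < 0)
        actual = -1;
    close(fd);
    return actual;
}
```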

       Linux 2.4 introduced a number of changes for improved
       throughput and scaling, as well as enhanced functionality.
       Some of these features include support for zerocopy
       sendfile(2), Explicit Congestion Notification, new
       management of TIME_WAIT sockets, keep-alive socket options
       and support for Duplicate SACK extensions.


ADDRESS FORMATS

       TCP  is  built on top of IP (see ip(7)).  The address for­
       mats defined by ip(7) apply to  TCP.   TCP  only  supports
       point-to-point  communication; broadcasting and multicast­
       ing are not supported.


SYSCTLS

       These    variables    can    be    accessed     by     the
       /proc/sys/net/ipv4/*  files  or  with the sysctl(2) inter­
       face.  In addition, most IP sysctls also apply to TCP; see
       ip(7).
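
       Reading one of these variables through the /proc interface
       can be sketched as follows.  The helper name
       read_tcp_sysctl is hypothetical, and only the first integer
       of a variable is parsed, so for vector variables such as
       tcp_mem it returns the first element only.

```c
#include <stdio.h>

/* Hypothetical helper: read the first integer of
 * /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
int read_tcp_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *fp;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;                  /* name too long for the buffer */
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    ok = (fscanf(fp, "%ld", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

       For example, read_tcp_sysctl("tcp_keepalive_time", &v)
       stores the idle time in seconds in v when the file is
       readable.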

       tcp_abort_on_overflow
              Enable resetting connections if the listening
              service is too slow and unable to keep up and accept
              them.  It is not enabled by default, which means
              that if an overflow occurred due to a burst, the
              connection will recover.  Enable this option _only_
              if you are really sure that the listening daemon
              cannot be tuned to accept connections faster.
              Enabling this option can harm the clients of your
              server.

       tcp_adv_win_scale
              Count         buffering         overhead         as
              bytes/2^tcp_adv_win_scale  (if  tcp_adv_win_scale >
              0) or bytes-bytes/2^(-tcp_adv_win_scale), if it  is
              <= 0. The default is 2.

              The socket receive buffer space is shared between
              the application and kernel.  TCP maintains part of
              the buffer as the TCP window; this is the size of
              the receive window advertised to the other end.
              The rest of the space is used as the "application"
              buffer, used to isolate the network from scheduling
              and application latencies.  The tcp_adv_win_scale
              default value of 2 implies that the space used for
              the application buffer is one fourth that of the
              total.

       tcp_app_win
              This variable defines how many  bytes  of  the  TCP
              window are reserved for buffering overhead.

              max(window/2^tcp_app_win, mss) bytes in the window
              are reserved for the application buffer.

       tcp_fack
              Enable  TCP Forward Acknowledgement support.  It is
              enabled by default.

       tcp_fin_timeout
              How many seconds to wait for  a  final  FIN  packet
              before  the  socket  is  forcibly  closed.  This is
              strictly a violation of the TCP specification,  but
              required   to   prevent   denial-of-service   (DoS)
              attacks.  The default value in 2.4 kernels  is  60,
              down from 180 in 2.2.

       tcp_keepalive_intvl
              The   number  of  seconds  between  TCP  keep-alive
              probes.  The default value is 75 seconds.

       tcp_keepalive_probes
              The maximum number of TCP keep-alive probes to send
              before  giving  up and killing the connection if no
              response is  obtained  from  the  other  end.   The
              default value is 9.

       tcp_keepalive_time
              The number of seconds a connection needs to be idle
              before TCP begins sending  out  keep-alive  probes.
              Keep-alives  are  only  sent  when the SO_KEEPALIVE
              socket option is enabled.  The default value is
              7200 seconds (2 hours).  An idle connection is
              terminated after approximately an additional 11
              minutes (9 probes at intervals of 75 seconds) when
              keep-alive is enabled.

              Note that underlying connection tracking mechanisms
              and application timeouts may be much shorter.

       tcp_max_orphans
              The maximum number of orphaned (not attached to any
              user file handle) TCP sockets allowed in  the  sys­
              tem.   When  this  number is exceeded, the orphaned
              connection is reset and a warning is printed.  This
              limit  exists  only  to prevent simple DoS attacks.
              Lowering this limit  is  not  recommended.  Network
              conditions might require you to increase the number
              of orphans allowed, but note that each  orphan  can
              eat  up to ~64K of unswappable memory.  The default
              initial value is set equal to the kernel  parameter
              NR_FILE.   This initial default is adjusted depend­
              ing on the memory in the system.

       tcp_max_syn_backlog
              The maximum number of queued connection requests
              which have still not received an acknowledgement
              from the connecting client.

       tcp_max_tw_buckets
              The maximum number of sockets in TIME_WAIT state
              allowed in the system.  This limit exists only to
              prevent simple DoS attacks.  The default value of
              NR_FILE*2 is adjusted depending on the memory in
              the system.  If this number is exceeded, the socket
              is closed and a warning is printed.

       tcp_mem
              This is a vector of  3  integers:  [low,  pressure,
              high].   These  bounds are used by TCP to track its
              memory usage.  The defaults are calculated at  boot
              time from the amount of available memory.

              low  -  TCP  doesn't regulate its memory allocation
              when the number of pages it has allocated  globally
              is below this number.

              pressure  -  when the amount of memory allocated by
              TCP exceeds this number of pages, TCP moderates its
              memory  consumption.  This memory pressure state is
              exited once the number  of  pages  allocated  falls
              below the low mark.

              high  - the maximum number of pages, globally, that
              TCP will allocate.  This value overrides any  other
              limits imposed by the kernel.

       tcp_orphan_retries
              The  maximum  number  of attempts made to probe the
              other end of a connection which has been closed  by
              our end.  The default value is 8.

       tcp_reordering
              The  maximum  a  packet  can  be reordered in a TCP
              packet stream without TCP assuming packet loss  and
              going  into  slow  start.  The default is 3.  It is
              not advisable to change this  number.   This  is  a
              packet reordering detection metric designed to min­
              imize unnecessary back off and retransmits provoked
              by reordering of packets on a connection.

       tcp_retrans_collapse
              Try  to  send full-sized packets during retransmit.
              This is enabled by default.

       tcp_retries1
              The number of times TCP will attempt to  retransmit
              a  packet  on  an  established connection normally,
              without the extra effort  of  getting  the  network
              layers  involved.   Once  we  exceed this number of
              retransmits, we first have the network layer update
              the  route  if possible before each new retransmit.
              The default is the RFC specified minimum of 3.

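       tcp_rfc1337
              Enable TCP behaviour conformant with RFC 1337.  This
              is not enabled by default.  When not enabled, if a
              RST is received in TIME_WAIT state, the socket is
              closed immediately without waiting for the end of
              the TIME_WAIT period.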

       tcp_rmem
              This is a vector  of  3  integers:  [min,  default,
              max].  These parameters are used by TCP to regulate
              receive buffer sizes.  TCP dynamically adjusts  the
              size of the receive buffer from the defaults listed
              below, in the  range  of  these  sysctl  variables,
              depending on memory available in the system.

              min  -  minimum  size of the receive buffer used by
              each TCP socket.  The default value is 4K,  and  is
              lowered  to  PAGE_SIZE bytes in low memory systems.
              This value is used to ensure that in  memory  pres­
              sure  mode,  allocations below this size will still
              succeed.  This is not used to bound the size of the
              receive   buffer  declared  using  SO_RCVBUF  on  a
              socket.

              default - the default size of  the  receive  buffer
              for  a  TCP socket.  This value overwrites the ini­
              tial default buffer size from  the  generic  global
              net.core.rmem_default  defined  for  all protocols.
              The default value is 87380 bytes, and is lowered to
              43689  in  low  memory  systems.  If larger receive
              buffer sizes are  desired,  this  value  should  be
              increased (to affect all sockets).  To employ large
              TCP windows, the  net.ipv4.tcp_window_scaling  must
              be enabled (default).

              max  -  the maximum size of the receive buffer used
              by each TCP socket.  This value does  not  override
              the  global net.core.rmem_max.  This is not used to
              limit the size of the receive buffer declared using
              SO_RCVBUF  on  a  socket.   The  default  value  of
              87380*2 bytes is lowered to  87380  in  low  memory
              systems.

       tcp_sack
              Enable  RFC2018 TCP Selective Acknowledgements.  It
              is enabled by default.

       tcp_stdurg
              Enable the strict RFC793 interpretation of the  TCP
              urgent-pointer  field.   The  default is to use the
              BSD-compatible  interpretation   of   the   urgent-
              pointer,  pointing  to  the  first  byte  after the
              urgent data.  The RFC793 interpretation is to  have
              it point to the last byte of urgent data.  Enabling
              this option may lead to interoperability problems.

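       tcp_syncookies
              Enable TCP syncookies.  The kernel must be compiled
              with CONFIG_SYN_COOKIES.  Syncookies are sent out
              when the SYN backlog queue of a socket overflows, in
              an attempt to protect the socket from a SYN flood
              attack.  This should be used as a last resort, if at
              all.  It is a violation of the TCP protocol, and
              conflicts with other areas of TCP such as TCP
              extensions.  It can cause problems for clients and
              relays.  It is not recommended as a tuning mechanism
              for heavily loaded servers to help with overloaded
              or misconfigured conditions.  For recommended
              alternatives see tcp_max_syn_backlog,
              tcp_synack_retries, tcp_abort_on_overflow.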

       tcp_syn_retries
              The maximum number of times  initial  SYNs  for  an
              active  TCP  connection attempt will be retransmit­
              ted.  This value should not  be  higher  than  255.
              The  default  value  is  5,  which  corresponds  to
              approximately 180 seconds.

       tcp_timestamps
              Enable RFC1323 TCP timestamps.  This is enabled  by
              default.

       tcp_tw_recycle
              Enable  fast recycling of TIME-WAIT sockets.  It is
              not enabled by default.  Enabling  this  option  is
              not  recommended  since  this  causes problems when
              working with NAT (Network Address Translation).

       tcp_window_scaling
              Enable RFC1323 TCP window scaling.  It  is  enabled
              by default.  This feature allows the use of a large
              window (> 64K) on  a  TCP  connection,  should  the
              other  end support it.  Normally, the 16 bit window
              length field in the TCP header  limits  the  window
              size to less than 64K bytes.  If larger windows are
              desired, applications  can  increase  the  size  of
              their  socket buffers and the window scaling option
              will be employed.  If  tcp_window_scaling  is  dis­
              abled,  TCP  will  not  negotiate the use of window
              scaling with the other end during connection setup.

       tcp_wmem
              This  is  a  vector  of  3 integers: [min, default,
              max].  These parameters are used by TCP to regulate
              send  buffer  sizes.   TCP  dynamically adjusts the
              size of the send buffer  from  the  default  values
              listed  below,  in  the range of these sysctl vari­
              ables, depending on memory available.

              min - minimum size of the send buffer used by  each
              TCP  socket.   The default value is 4K bytes.  This
              value is used to ensure  that  in  memory  pressure
              mode,  allocations  below this size will still suc­
              ceed.  This is not used to bound the  size  of  the
              send buffer declared using SO_SNDBUF on a socket.

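              max - the maximum size of the send buffer used by
              each TCP socket.  This value does not override the
              global net.core.wmem_max.  This is not used to limit
              the size of the send buffer declared using
              SO_SNDBUF on a socket.  The default value is 128K
              bytes.  It is lowered to 64K depending on the memory
              available in the system.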
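
       The arithmetic behind tcp_adv_win_scale and
       tcp_window_scaling can be sketched as follows.  Both helper
       names are hypothetical, and the functions only restate the
       formulas given in the entries above; they are not the
       kernel's own code.  The second function assumes the RFC1323
       limit of 14 for the shift count and simply finds the
       smallest shift that lets a window fit the 16 bit header
       field.

```c
/* Hypothetical helper: buffering overhead as described under
 * tcp_adv_win_scale.  Assumes bytes >= 0 and a scale whose
 * magnitude is smaller than the width of long. */
long buffering_overhead(long bytes, int scale)
{
    if (scale > 0)
        return bytes >> scale;              /* bytes / 2^scale */
    return bytes - (bytes >> -scale);       /* bytes - bytes / 2^(-scale) */
}

/* Hypothetical helper: smallest window scale shift (0..14) for which
 * window_bytes, shifted right, fits into the 16 bit window field. */
int window_scale_shift(unsigned long window_bytes)
{
    int shift = 0;

    while (shift < 14 && (window_bytes >> shift) > 65535UL)
        shift++;
    return shift;
}
```

       With the default tcp_rmem value of 87380 bytes and the
       default tcp_adv_win_scale of 2, the overhead is 21845
       bytes, which leaves 65535 bytes for the TCP window.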


SOCKET OPTIONS

       To  set  or get a TCP socket option, call getsockopt(2) to
       read or setsockopt(2) to write the option with the  option
       level  argument  set to SOL_TCP.  In addition, most SOL_IP
       socket options are valid on TCP sockets. For more informa­
       tion see ip(7).
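
       A minimal sketch of reading and writing an integer TCP
       option is shown below.  The helper names set_tcp_flag and
       get_tcp_int are hypothetical.  The sketch passes
       IPPROTO_TCP as the level, which has the same numeric value
       as SOL_TCP on Linux.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: set a boolean TCP-level option such as
 * TCP_NODELAY.  Returns 0 on success, -1 on error. */
int set_tcp_flag(int fd, int option, int on)
{
    return setsockopt(fd, IPPROTO_TCP, option, &on, sizeof(on));
}

/* Hypothetical helper: read an integer TCP-level option into *value.
 * Returns 0 on success, -1 on error. */
int get_tcp_int(int fd, int option, int *value)
{
    socklen_t len = sizeof(*value);

    return getsockopt(fd, IPPROTO_TCP, option, value, &len);
}
```

       For example, set_tcp_flag(fd, TCP_NODELAY, 1) disables the
       Nagle algorithm on fd, and a following
       get_tcp_int(fd, TCP_NODELAY, &v) reports a nonzero v.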

       TCP_CORK
              If  set, don't send out partial frames.  All queued
              partial frames are sent when the option is  cleared
              again.   This  is  useful  for  prepending  headers
              before calling sendfile(2), or for throughput opti­
              mization.   This  option  cannot  be  combined with
              TCP_NODELAY.  This option should  not  be  used  in
              code intended to be portable.

       TCP_DEFER_ACCEPT
              Allows a listener to be awakened only when data
              arrives on the socket.  Takes an integer value
              (seconds); this can bound the maximum number of
              attempts TCP will make to complete the connection.
              This option should not be used in code intended to
              be portable.

       TCP_INFO
              Used to collect information about this socket.  The
              kernel  returns a struct tcp_info as defined in the
              file /usr/include/linux/tcp.h.  This option  should
              not be used in code intended to be portable.

       TCP_KEEPCNT
              The  maximum  number of keepalive probes TCP should
              send before dropping the connection.   This  option
              should not be used in code intended to be portable.

       TCP_KEEPIDLE
              The time  (in  seconds)  the  connection  needs  to
              remain  idle  before  TCP  starts sending keepalive
              probes, if the socket option SO_KEEPALIVE has  been
              set on this socket.  This option should not be used
              in code intended to be portable.

       TCP_KEEPINTVL
              The time (in seconds) between individual  keepalive
              probes.   This  option  should  not be used in code
              intended to be portable.

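       TCP_LINGER2
              The lifetime of orphaned FIN_WAIT2 state sockets.
              This option can be used to override the system wide
              sysctl tcp_fin_timeout on this socket.  This is not
              to be confused with the socket(7) level option
              SO_LINGER.  This option should not be used in code
              intended to be portable.

       TCP_MAXSEG
              The maximum segment size for outgoing TCP packets.
              If this option is set before connection
              establishment, it also changes the MSS value
              announced to the other end in the initial packet.
              Values greater than the (eventual) interface MTU
              have no effect.  TCP will also impose its minimum
              and maximum bounds over the value provided.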

       TCP_NODELAY
              If  set,  disable  the Nagle algorithm.  This means
              that segments are always sent as soon as  possible,
              even if there is only a small amount of data.  When
              not set, data is buffered until there is  a  suffi­
              cient amount to send out, thereby avoiding the fre­
              quent sending of small packets,  which  results  in
              poor  utilization of the network.  This option can­
              not  be  used  at  the  same  time  as  the  option
              TCP_CORK.

       TCP_QUICKACK
              Enable quickack mode if set or disable quickack
              mode if cleared.  In quickack mode, acks are sent
              immediately, rather than being delayed, if needed,
              in accordance with normal TCP operation.  This flag
              is not permanent; it only enables a switch to or
              from quickack mode.  Subsequent operation of the TCP
              protocol will once again enter or leave quickack
              mode depending on internal protocol processing and
              factors such as delayed ack timeouts occurring and
              data transfer.  This option should not be used in
              code intended to be portable.

       TCP_SYNCNT
              Set  the  number of SYN retransmits that TCP should
              send before aborting the attempt  to  connect.   It
              cannot  exceed 255.  This option should not be used
              in code intended to be portable.

       TCP_WINDOW_CLAMP
              Bound the size of the  advertised  window  to  this
              value.   The  kernel  imposes  a  minimum  size  of
              SOCK_MIN_RCVBUF/2.  This option should not be  used
              in code intended to be portable.


IOCTLS

       These  ioctls can be accessed using ioctl(2).  The correct
       syntax is:

              int value;
              error = ioctl(tcp_socket, ioctl_type, &value);

       SIOCINQ
              Returns the amount of queued  unread  data  in  the
              receive  buffer.  Argument is a pointer to an inte­
              ger.  The socket must not be in LISTEN state,  oth­
              erwise an error (EINVAL) is returned.

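       SIOCATMARK
              Returns true when all urgent data has already been
              received by the user program.  This is used together
              with SO_OOBINLINE.  Argument is a pointer to an
              integer for the test result.


ERROR HANDLING

       When a network error occurs, TCP tries to resend the
       packet.  If it doesn't succeed after some time, either
       ETIMEDOUT or the last received error on this connection is
       reported.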

       Some applications require a  quicker  error  notification.
       This  can  be  enabled  with  the  SOL_IP level IP_RECVERR
       socket option.  When this option is enabled, all  incoming
       errors  are  immediately  passed to the user program.  Use
       this option with care - it  makes  TCP  less  tolerant  to
       routing changes and other normal network conditions.
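
       Enabling the option is a single setsockopt(2) call, as in
       the sketch below.  The helper name enable_recverr is
       hypothetical, and IPPROTO_IP is passed as the level because
       it has the same numeric value as SOL_IP on Linux.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: ask for incoming errors to be passed to the
 * program immediately.  Returns 0 on success, -1 on error. */
int enable_recverr(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}
```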


NOTES

       When an error that occurred during connection setup is
       reported by a socket write, SIGPIPE is only raised when
       the SO_KEEPALIVE socket option is set.

       TCP  has  no real out-of-band data; it has urgent data. In
       Linux this means if the other end sends newer  out-of-band
       data the older urgent data is inserted as normal data into
       the stream (even when SO_OOBINLINE is not set). This  dif­
       fers from BSD based stacks.

       Linux uses the BSD compatible interpretation of the urgent
       pointer field by default.  This violates RFC1122,  but  is
       required  for  interoperability with other stacks.  It can
       be changed by the tcp_stdurg sysctl.


ERRORS

       EPIPE  The other end closed the socket unexpectedly  or  a
              read is executed on a shut down socket.

       ETIMEDOUT
              The other end didn't acknowledge retransmitted data
              after some time.

       EAFNOTSUPPORT
              Passed socket address type in  sin_family  was  not
              AF_INET.

       Any  errors  defined for ip(7) or the generic socket layer
       may also be returned for TCP.
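
       A sender can tell these errors apart by inspecting errno
       after a failed call, as in the following sketch.  The
       helper name send_checked and its return codes are
       hypothetical.  The MSG_NOSIGNAL flag, which is not
       described on this page, is used here as an assumption so
       that a closed peer produces EPIPE instead of a SIGPIPE
       signal.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send a buffer and classify the result.
 * Returns 0 if the kernel accepted some or all of the data, 1 if the
 * peer is gone (EPIPE or ECONNRESET), 2 on ETIMEDOUT, -1 otherwise. */
int send_checked(int fd, const void *buf, size_t len)
{
    ssize_t n = send(fd, buf, len, MSG_NOSIGNAL);

    if (n >= 0)
        return 0;
    if (errno == EPIPE || errno == ECONNRESET)
        return 1;
    if (errno == ETIMEDOUT)
        return 2;
    return -1;
}
```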


BUGS

       Not all errors are documented.
       IPv6 is not described.


VERSIONS

       Support for  Explicit  Congestion  Notification,  zerocopy
       sendfile,  reordering  support  and  some  SACK extensions
       (DSACK) were  introduced  in  2.4.   Support  for  forward
       acknowledgement  (FACK),  TIME_WAIT recycling, per connec­
       tion keepalive socket options and sysctls were  introduced
       in 2.3.


SEE ALSO

       RFC793 for the TCP specification.
       RFC1122  for the TCP requirements and a description of the
       Nagle algorithm.
       RFC1323 for TCP timestamp and window scaling options.
       RFC1337 for a description of TIME_WAIT assassination
       hazards.
       RFC2481 for a description of Explicit Congestion Notifica­
       tion.
       RFC2581 for TCP congestion control algorithms.
       RFC2018 and RFC2883 for SACK and extensions to SACK.

Linux Man Page              2003-08-21                     TCP(7)
  