Please note that I am aware of the streaming nature of TCP connections; my question is not about that kind of thing. It is rather about a suspected bug in the Linux sockets implementation.
Update: taking the comments into account, I updated my code a little to check the return value of recv() not just against -1 but against any negative value, just in case. The results are the same.
I have a very simple TCP client/server application written in C.
The full code of this project is available on GitHub.
The client side runs multiple parallel threads, each of which does the following:
- open a socket
- connect this socket to the server
- write 16 bytes of a predefined data pattern to the socket in chunks of random length
- close the socket
- repeat steps 1 to 4 N times
static size_t send_ex(int fd, const uint8_t *buff, size_t len, bool by_frags)
{
    if ( by_frags )
    {
        size_t chunk_len, pos;
        ssize_t res;
        for ( pos = 0; pos < len; )
        {
            /* pick a random chunk length in [1, len - pos] */
            chunk_len = (size_t) random();
            chunk_len %= (len - pos);
            chunk_len++;
            res = send(fd, &buff[pos], chunk_len, 0);
            if ( res < 0 || (size_t) res != chunk_len ) {
                return (size_t) -1;
            }
            pos += chunk_len;
        }
        return len;
    }
    return send(fd, buff, len, 0);
}
static void *connection_task(void *arg)
{
    connection_ctx_t *ctx = (connection_ctx_t *) arg;
    /* first word is the magic marker the server checks */
    uint32_t buff[4] = {0xAA55AA55, 0x12345678, 0x12345678, 0x12345678};
    size_t sent;
    int res, fd, i;
    /* count, frags and delay come from the command-line options */
    for ( i = 0; i < count; i++ )
    {
        fd = socket(AF_INET, SOCK_STREAM, 0);
        if ( fd < 0 ) {
            fprintf(stderr, "Can't create socket!\n");
            break;
        }
        res = connect(fd, (struct sockaddr *) ctx->serveraddr, sizeof(struct sockaddr_in));
        if ( res < 0 ) {
            fprintf(stderr, "Connect failed!\n");
            close(fd);
            break;
        }
        sent = send_ex(fd, (const uint8_t *) buff, sizeof(buff), frags);
        if ( sent != sizeof(buff) ) {
            fprintf(stderr, "Send failed!\n");
            close(fd);
            break;
        }
        ctx->sent_packs++;
        res = close(fd);
        if ( res < 0 ) {
            fprintf(stderr, "CLI: Close Failed!!\n");
        }
        msleep(delay);
    }
    return NULL;
}
The server side runs a thread for each incoming connection, which does the following:
- read data from the connected socket until all 16 bytes have been read
- after at least the first 4 bytes have been read, check that they are equal to the predefined pattern
typedef struct client_ctx_s {
    struct sockaddr_in addr;
    int fd;
} client_ctx_t;

void *client_task(void *arg)
{
    client_ctx_t *client = (client_ctx_t *) arg;
    size_t free_space, pos;
    ssize_t chunk_len;
    uint32_t buff[4] = {0};
    int res;
    pos = 0;
    while ( pos != sizeof(buff) )
    {
        free_space = sizeof(buff) - pos;
        assert(pos < sizeof(buff));
        chunk_len = recv(client->fd, &((uint8_t *) buff)[pos], free_space, 0);
        if ( chunk_len <= 0 ) {
            if ( chunk_len < 0 ) {
                fprintf(stderr, "%s:%u: ERROR: recv failed (errno = %d; pos = %zu)!\n",
                        inet_ntoa(client->addr.sin_addr),
                        ntohs(client->addr.sin_port),
                        errno, pos);
            }
            else if ( pos && pos < sizeof(buff) ) {
                /* peer closed the connection before all 16 bytes arrived */
                fprintf(stderr, "%s:%u: ERROR: incomplete data block (pos = %zu)!\n",
                        inet_ntoa(client->addr.sin_addr),
                        ntohs(client->addr.sin_port),
                        pos);
            }
            goto out;
        }
        assert((size_t) chunk_len <= free_space);
        pos += chunk_len;
        /* as soon as the first 4 bytes are in, verify the magic marker */
        if ( pos >= 4 && buff[0] != 0xAA55AA55 ) {
            fprintf(stderr, "%s:%u: ERROR: data corrupted (%08x)!\n",
                    inet_ntoa(client->addr.sin_addr),
                    ntohs(client->addr.sin_port),
                    buff[0]);
        }
    }
    fprintf(stdout, "%s:%u: %08x %08x %08x %08x\n",
            inet_ntoa(client->addr.sin_addr),
            ntohs(client->addr.sin_port),
            buff[0], buff[1], buff[2], buff[3]);
out:
    debug("Connection closed\n");
    res = close(client->fd);
    assert(res == 0);
    free(client);
    return NULL;
}
Issues that came up when the client runs one thousand sending threads, each of them repeating the connect-send-disconnect cycle one hundred times (./client -t 1000 -c 100 -d 0 -f):
- The first bytes of the pattern that was sent are lost.
- The total amount of data read from the socket is correspondingly less than 16 bytes.
This behaviour is reproducible both on localhost and over a real network connection.
Examining the TCP flow of the corrupted data with Wireshark shows that:
- There is no issue on the client side.
- The corrupted data corresponds to the data carried in retransmitted TCP segments.
I can’t really believe this problem lies in the Linux TCP/IP implementation.
Can anybody explain what is wrong with my code?
Answer:
I have the same behavior (when the client runs with the -f [--fragments] option) with a python3 server implementation and the original client in C, and only the beginning of the sequence (the first chunk?) is ever lost.
In short: I think SYN cookies are the root of the problem.
I don't know whether the resulting behavior (a "broken" first recv call after accept) is a kernel bug. As far as I understand, the SYN-cookies feature allows the client to be "accepted" without it noticing anything, but it can create problems for the server application. However, I don't have a definitive answer as to why this behavior is part of the default policy. Your client does look like a network attacker =).
Possibly useful link: https://access.redhat.com/solutions/30453
I found it rather strange that only the head of the sequence disappears… Next I checked dmesg and saw the kernel's SYN-flood warning, along these lines (exact wording varies by kernel version; the port is the server's listening port):

TCP: request_sock_TCP: Possible SYN flooding on port <port>. Sending cookies. Check SNMP counters.

Next I disabled SYN cookies (as far as I know, this is not recommended in production):
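sysctl -w net.ipv4.tcp_syncookies=0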
After this, "data corruption" disappeared.
The following change in server.c:114, increasing the backlog passed to listen() (see https://man7.org/linux/man-pages/man2/listen.2.html), also fixes it:
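A sketch of that kind of change, assuming server.c:114 is the listen() call and the original backlog was small (the variable names and values here are illustrative, not necessarily the repo's actual ones):

res = listen(fd, 10);        /* before: SYN queue overflows under 1000 parallel connects */
res = listen(fd, SOMAXCONN); /* after: pending connections are queued normally,
                              * so the SYN-cookie path is not triggered */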
Additionally, updating the client logic to use the MSG_MORE flag (Linux-only, https://man7.org/linux/man-pages/man2/sendto.2.html) also works around the problem, because it reduces the load and lets the server "catch" the beginning of the data. Which leads to the strange thought that there may be a bug in this protection mechanism of the operating system.
I don't understand why, in case of server overload, the default behavior is not "reject connection requests until ready".
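For illustration, a minimal sketch of what that client change could look like, modeled on the question's send_ex() (the name send_ex_more is mine, and this is one possible way to apply MSG_MORE, not the answerer's actual patch):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Like send_ex(), but hints the kernel with MSG_MORE on every chunk except
 * the last one, so the random fragments are coalesced into fewer segments
 * instead of hitting the wire one by one. */
static size_t send_ex_more(int fd, const uint8_t *buff, size_t len, bool by_frags)
{
    if ( !by_frags )
        return send(fd, buff, len, 0);

    size_t pos = 0;
    while ( pos < len )
    {
        /* random chunk length in [1, len - pos], as in the original */
        size_t chunk_len = (size_t) random() % (len - pos) + 1;
        /* MSG_MORE on all chunks but the last: "more data follows" */
        int flags = (pos + chunk_len < len) ? MSG_MORE : 0;
        ssize_t res = send(fd, &buff[pos], chunk_len, flags);
        if ( res < 0 || (size_t) res != chunk_len )
            return (size_t) -1;
        pos += chunk_len;
    }
    return len;
}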