How to use tee and splice
 
2017-03-02 | by David "DeMO" Martínez Oliveira

I read about tee and splice long ago. I quickly went through the man pages and, like in the movie The Matrix, I said to myself: "I know tee and splice". Recently, I had to write some code that perfectly matched the functionality of these two system calls... but then I actually tried them.
Yes, I went straight ahead and teed and spliced my file descriptors and... the thing did not even compile. It was time to read the man pages again to figure out what the problem was.

As usual, there was no real problem; it was just me. I hadn't really read what the man page said. So, in case you ever need to use these syscalls, here is how to do it.

Magic does not exist

Despite what some companies with fruit logos keep saying, magic just does not exist. At least, it does not exist in the world of computers.

When you first look at tee and splice, it seems that some kind of advanced wizardry is performed in the kernel, copying buffers around and magically transferring data from some file descriptors to others. In a sense that is what happens, but with one limitation: those buffers cannot be just anything.

Actually, we have to use pipes to make all this kernel code work. In short, the reason is that we still need a buffer to hold the data, even in the kernel, and a pipe can be seen as a kernel buffer. Go through these messages from the kernel development list for more details.

http://yarchive.net/comp/linux/splice.html

Therefore, in order to use tee and splice we have to create pipes to hold our data.
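
To make the pipe requirement concrete, here is a minimal sketch. The helper names try_direct_splice and splice_through_pipe are made up for this illustration. The first one attempts a splice between two descriptors without checking that either is a pipe; splice(2) documents EINVAL for the case where neither descriptor refers to a pipe. The second one does the same transfer in a single pass through a temporary pipe, assuming len is small enough to fit in the pipe.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to splice len bytes straight from fd_in to
   fd_out.  Returns 0 on success or the errno value on failure. */
int
try_direct_splice (int fd_in, int fd_out, size_t len)
{
  if (splice (fd_in, NULL, fd_out, NULL, len, 0) < 0)
    return errno;
  return 0;
}

/* Hypothetical helper: same transfer, but with a pipe in the middle.
   Single pass only; assumes len fits in the pipe.
   Returns the number of bytes that reached fd_out, or -1 on error. */
ssize_t
splice_through_pipe (int fd_in, int fd_out, size_t len)
{
  int p[2];
  ssize_t moved;

  if (pipe (p) < 0)
    return -1;

  /* fd_in -> pipe: the pipe is the kernel buffer holding the data */
  moved = splice (fd_in, NULL, p[1], NULL, len, 0);
  if (moved > 0)
    /* pipe -> fd_out */
    moved = splice (p[0], NULL, fd_out, NULL, (size_t) moved, 0);

  close (p[0]);
  close (p[1]);
  return moved;
}
```

Called on two regular files, try_direct_splice should report EINVAL, while splice_through_pipe moves the data.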

The tee Tool re-implemented

When researching this topic I looked at the source code of the tee tool. I was expecting it to use tee and splice; however, what I found was the straightforward implementation that reads from some file descriptors and writes to others.

So I decided to re-implement it using these system calls, just for fun, and this is the code I came up with:

#define _GNU_SOURCE         /* See feature_test_macros(7) */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>         /* pipe, close */

int
main (int argc, char *argv[])
{
  int fd_in = 0;  // stdin
  int fd_out = 1; // stdout
  int fd_file;
  int pipe1[2], pipe2[2];
  int n;

  if (argc != 2)
    {
      fprintf (stderr, "Usage: %s file\n\n", argv[0]);
      return 1;
    }
  if ((fd_file = open (argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0)
    {
      perror ("open");
      return 1;
    }
  if (pipe (pipe1) < 0)
    {
      perror ("pipe1");
      return 1;
    }
  if (pipe (pipe2) < 0)
    {
      perror ("pipe2");
      return 1;
    }

  while (1)
    {
      // Move up to 1024 bytes from stdin into pipe1
      if ((n = splice (fd_in, NULL, pipe1[1], NULL, 1024,
                       SPLICE_F_MOVE | SPLICE_F_MORE)) < 0)
        {
          perror ("splice (stdin)");
          return 1;
        }
      if (n == 0)
        {
          // End of input
          fprintf (stderr, "DONE\n");
          close (fd_file);
          return 0;
        }
      // Duplicate pipe1 into pipe2 without consuming data
      // Now we have the same data in pipe1 and pipe2
      if (tee (pipe1[0], pipe2[1], n, 0) < 0)
        {
          perror ("tee");
          close (fd_file);
          return 1;
        }

      // Drain pipe1 to stdout
      if (splice (pipe1[0], NULL, fd_out, NULL, n,
                  SPLICE_F_MOVE | SPLICE_F_MORE) < 0)
        {
          perror ("splice (stdout)");
          return 1;
        }
      // Drain pipe2 to the file
      if (splice (pipe2[0], NULL, fd_file, NULL, n,
                  SPLICE_F_MOVE | SPLICE_F_MORE) < 0)
        {
          perror ("splice (file)");
          return 1;
        }
    }
}

The code is self-explanatory. The only thing you need to know is that tee only works on pipes (both of its descriptors must be pipes), while splice needs at least one of its two descriptors to be a pipe.
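
What "without consuming data" means for tee can be checked in isolation. The following is a minimal sketch, and tee_demo is a hypothetical helper name, not part of the program above. It writes a short message into one pipe, tees it into a second pipe and then reads both pipes back. It assumes the message is small enough to fit in a pipe in one go.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: put len bytes of msg into pipe a, duplicate
   them into pipe b with tee, then read both pipes into out_a and
   out_b.  Returns 0 if every step handled exactly len bytes,
   -1 otherwise.  Assumes len fits in a pipe. */
int
tee_demo (const char *msg, size_t len, char *out_a, char *out_b)
{
  int a[2], b[2];
  int rc = -1;

  if (pipe (a) < 0)
    return -1;
  if (pipe (b) < 0)
    {
      close (a[0]);
      close (a[1]);
      return -1;
    }

  if (write (a[1], msg, len) == (ssize_t) len
      /* tee copies the data into b but leaves it queued in a */
      && tee (a[0], b[1], len, 0) == (ssize_t) len
      /* so the same bytes can still be read from a ... */
      && read (a[0], out_a, len) == (ssize_t) len
      /* ... and from b */
      && read (b[0], out_b, len) == (ssize_t) len)
    rc = 0;

  close (a[0]);
  close (a[1]);
  close (b[0]);
  close (b[1]);
  return rc;
}
```

After a successful call both output buffers hold the same bytes, which is exactly what the loop above relies on when it drains pipe1 to stdout and pipe2 to the file.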

Conclusion

It was interesting to figure out how to use these system calls. I ran some performance tests comparing the system tee command with my implementation, and I couldn't find any real performance difference between them.

As a final note, I checked the code above in the main scenarios and it always worked for me. However, if either tee or splice transfers fewer bytes than we requested, the code above will fail. It never happened in my tests but, since both calls return the number of bytes actually transferred, just like the read system call does, I would expect it to be possible.

I haven't written the code for that case, to keep this a simple and neat example of how to use the system calls.
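
For reference, one possible way to deal with short transfers is sketched below. This is not the code from this article and it has not been benchmarked; splice_all and fan_out are hypothetical names, and the sketch assumes the len bytes are already waiting in the source pipe. Since tee does not consume its input, a short tee cannot simply be retried: the bytes that were duplicated have to be drained from both pipes before teeing the rest.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: move exactly len bytes from fd_in to fd_out,
   retrying on short transfers.  At least one of them must be a pipe.
   Returns 0 on success, -1 on error or unexpected end of data. */
int
splice_all (int fd_in, int fd_out, size_t len)
{
  while (len > 0)
    {
      ssize_t n = splice (fd_in, NULL, fd_out, NULL, len, SPLICE_F_MOVE);
      if (n < 0 && errno == EINTR)
        continue;
      if (n <= 0)
        return -1;
      len -= (size_t) n;
    }
  return 0;
}

/* Hypothetical helper: send the len bytes waiting in the pipe src_rd
   to both fd_out and fd_file, using a scratch pipe for the copy. */
int
fan_out (int src_rd, int scratch_wr, int scratch_rd,
         int fd_out, int fd_file, size_t len)
{
  while (len > 0)
    {
      /* Duplicate as much as tee is willing to take this round */
      ssize_t n = tee (src_rd, scratch_wr, len, 0);
      if (n < 0 && errno == EINTR)
        continue;
      if (n <= 0)
        return -1;
      /* Drain exactly those n bytes from both pipes before the
         next tee, so nothing is duplicated twice or skipped */
      if (splice_all (src_rd, fd_out, (size_t) n) < 0
          || splice_all (scratch_rd, fd_file, (size_t) n) < 0)
        return -1;
      len -= (size_t) n;
    }
  return 0;
}
```

In the program above, fan_out would take the place of the tee call and the two splice calls that follow it, with n as len.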

 