Python and sendfile

sendfile(2) is a UNIX system call which provides a "zero-copy" way of copying data from one file descriptor (a file) to another (a socket). Because this copying is done entirely within the kernel, sendfile(2) is more efficient than the combination of file.read() and socket.send(), which requires transferring data to and from user space. Copying the data twice imposes performance and resource penalties which sendfile(2) avoids. It also results in a single system call (and thus only one context switch), rather than the series of read(2) / write(2) system calls (each requiring a context switch) otherwise needed to copy the data. A more exhaustive explanation of how sendfile(2) works is available here, but the short version is that sending a file with sendfile() is usually twice as fast as using plain socket.send(). Typical applications which can benefit from using sendfile() are FTP and HTTP servers.
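
As a rough illustration of the difference, the sketch below sends the same file both ways. It assumes Python 3.3+ on a platform that provides os.sendfile() and a blocking, connected stream socket. The helper names send_with_read_send() and send_with_sendfile() are made up for this example and are not part of any library.

```python
import os
import socket


def send_with_read_send(sock, path, blocksize=8192):
    """Copy through user space: read() into a buffer, then sendall()."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(blocksize)   # kernel -> user space copy
            if not chunk:
                break                   # EOF
            sock.sendall(chunk)         # user space -> kernel copy
            total += len(chunk)
    return total


def send_with_sendfile(sock, path):
    """Let the kernel move the bytes from the file to the socket."""
    total = 0
    with open(path, "rb") as f:
        remaining = os.fstat(f.fileno()).st_size
        while remaining > 0:
            # os.sendfile(out_fd, in_fd, offset, count) returns bytes sent.
            sent = os.sendfile(sock.fileno(), f.fileno(), total, remaining)
            if sent == 0:
                break                   # EOF reached earlier than expected
            total += sent
            remaining -= sent
    return total
```

Both helpers return the number of bytes sent; only the second one avoids moving the file contents through a user-space buffer.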

socket.sendfile()

I recently contributed a patch for Python's socket module which adds a high-level socket.sendfile() method (see full discussion at issue 17552). socket.sendfile() transmits a file until EOF is reached, using os.sendfile() if available and falling back on plain socket.send() otherwise. Internally, it takes care of handling socket timeouts and provides two optional parameters to move the file offset or to send only a limited number of bytes. I came up with this idea because getting all of that right is a bit tricky, so a generic wrapper seemed convenient to have. socket.sendfile() will make its appearance in Python 3.5.
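
Assuming Python 3.5 or later, where the method is available, a minimal usage sketch of the two optional parameters looks like this. The demo() helper and the use of socket.socketpair() as a stand-in for a real connection are assumptions made purely for illustration.

```python
import socket
import tempfile


def demo():
    # socketpair() stands in for a real client/server connection.
    left, right = socket.socketpair()
    try:
        with tempfile.TemporaryFile() as f:   # opened in binary mode
            f.write(b"0123456789")
            f.flush()                         # make the bytes visible on disk
            # Skip the first 2 bytes, then send at most 5 bytes.
            sent = left.sendfile(f, offset=2, count=5)
        received = right.recv(16)
        return sent, received
    finally:
        left.close()
        right.close()
```

Here offset selects where reading starts and count caps the number of bytes transmitted; the return value is the total number of bytes sent.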

sendfile and Python

sendfile(2) made its first appearance in the Python stdlib rather late, in Python 3.3. It was contributed by Ross Lagerwall and me in issue 10882. Since the patch didn't make it into Python 2.x and I wanted to use sendfile() in pyftpdlib, I later decided to release it as a standalone module working with older (2.5+) Python versions (see the pysendfile project). Starting with version 3.5, Python will hopefully start using sendfile() more extensively, in detail:

TransmitFile. Now that socket.sendfile() is in place it seems natural to add support for it as well (see issue 21721).

Backport to Python 2.6 and 2.7

For those of you who are interested in using socket.sendfile() with the older Python 2.6 and 2.7 versions, here's a backport. It requires the pysendfile module to be installed. Full code including tests is hosted here.

#!/usr/bin/env python

"""
This is a backport of socket.sendfile() for Python 2.6 and 2.7.
socket.sendfile() will be included in Python 3.5:
http://bugs.python.org/issue17552
Usage:

>>> import socket
>>> file = open("somefile.bin", "rb")
>>> sock = socket.create_connection(("localhost", 8021))
>>> sendfile(sock, file)
42319283
>>>
"""

import errno
import io
import os
import select
import socket
try:
    memoryview  # py 2.7 only
except NameError:
    memoryview = lambda x: x

if os.name == 'posix':
    import sendfile as pysendfile  # requires "pip install pysendfile"
else:
    pysendfile = None


_RETRY = frozenset((errno.EAGAIN, errno.EALREADY, errno.EWOULDBLOCK,
                    errno.EINPROGRESS))


class _GiveupOnSendfile(Exception):
    pass


if pysendfile is not None:

    def _sendfile_use_sendfile(sock, file, offset=0, count=None):
        _check_sendfile_params(sock, file, offset, count)
        sockno = sock.fileno()
        try:
            fileno = file.fileno()
        except (AttributeError, io.UnsupportedOperation) as err:
            raise _GiveupOnSendfile(err)  # not a regular file
        try:
            fsize = os.fstat(fileno).st_size
        except OSError as err:
            raise _GiveupOnSendfile(err)  # not a regular file
        if not fsize:
            return 0  # empty file
        blocksize = fsize if not count else count

        timeout = sock.gettimeout()
        if timeout == 0:
            raise ValueError("non-blocking sockets are not supported")
        # poll/select have the advantage of not requiring any
        # extra file descriptor, contrarily to epoll/kqueue
        # (also, they require a single syscall).
        if hasattr(select, 'poll'):
            if timeout is not None:
                timeout *= 1000
            pollster = select.poll()
            pollster.register(sockno, select.POLLOUT)

            def wait_for_fd():
                if pollster.poll(timeout) == []:
                    raise socket.timeout('timed out')
        else:
            # call select() once in order to solicit ValueError in
            # case we run out of fds
            try:
                select.select([], [sockno], [], 0)
            except ValueError as err:
                raise _GiveupOnSendfile(err)

            def wait_for_fd():
                fds = select.select([], [sockno], [], timeout)
                if fds == ([], [], []):
                    raise socket.timeout('timed out')

        total_sent = 0
        # localize variable access to minimize overhead
        os_sendfile = pysendfile.sendfile
        try:
            while True:
                if timeout:
                    wait_for_fd()
                if count:
                    blocksize = count - total_sent
                    if blocksize <= 0:
                        break
                try:
                    sent = os_sendfile(sockno, fileno, offset, blocksize)
                except OSError as err:
                    if err.errno in _RETRY:
                        # Block until the socket is ready to send some
                        # data; avoids hogging CPU resources.
                        wait_for_fd()
                    else:
                        if total_sent == 0:
                            # We can get here for different reasons, the main
                            # one being 'file' is not a regular mmap(2)-like
                            # file, in which case we'll fall back on using
                            # plain send().
                            raise _GiveupOnSendfile(err)
                        raise  # re-raise, preserving the original traceback
                else:
                    if sent == 0:
                        break  # EOF
                    offset += sent
                    total_sent += sent
            return total_sent
        finally:
            if total_sent > 0 and hasattr(file, 'seek'):
                file.seek(offset)
else:
    def _sendfile_use_sendfile(sock, file, offset=0, count=None):
        raise _GiveupOnSendfile(
            "sendfile() not available on this platform")


def _sendfile_use_send(sock, file, offset=0, count=None):
    _check_sendfile_params(sock, file, offset, count)
    if sock.gettimeout() == 0:
        raise ValueError("non-blocking sockets are not supported")
    if offset:
        file.seek(offset)
    blocksize = min(count, 8192) if count else 8192
    total_sent = 0
    # localize variable access to minimize overhead
    file_read = file.read
    sock_send = sock.send
    try:
        while True:
            if count:
                blocksize = min(count - total_sent, blocksize)
                if blocksize <= 0:
                    break
            data = memoryview(file_read(blocksize))
            if not data:
                break  # EOF
            while True:
                try:
                    sent = sock_send(data)
                except socket.error as err:  # not an OSError subclass on 2.x
                    if err.errno in _RETRY:
                        continue
                    raise
                else:
                    total_sent += sent
                    if sent < len(data):
                        data = data[sent:]
                    else:
                        break
        return total_sent
    finally:
        if total_sent > 0 and hasattr(file, 'seek'):
            file.seek(offset + total_sent)


def _check_sendfile_params(sock, file, offset, count):
    if 'b' not in getattr(file, 'mode', 'b'):
        raise ValueError("file should be opened in binary mode")
    if not sock.type & socket.SOCK_STREAM:
        raise ValueError("only SOCK_STREAM type sockets are supported")
    if count is not None:
        if not isinstance(count, int):
            raise TypeError(
                "count must be a positive integer (got %s)" % repr(count))
        if count <= 0:
            raise ValueError(
                "count must be a positive integer (got %s)" % repr(count))


def sendfile(sock, file, offset=0, count=None):
    """sendfile(sock, file[, offset[, count]]) -> sent

    Send a *file* over a connected socket *sock* until EOF is
    reached by using high-performance sendfile(2) and return the
    total number of bytes which were sent.
    *file* must be a regular file object opened in binary mode.
    If sendfile() is not available (e.g. Windows) or file is
    not a regular file socket.send() will be used instead.
    *offset* tells from where to start reading the file.
    If specified, *count* is the total number of bytes to transmit
    as opposed to sending the file until EOF is reached.
    File position is updated on return or also in case of error in
    which case file.tell() can be used to figure out the number of
    bytes which were sent.
    The socket must be of SOCK_STREAM type.
    Non-blocking sockets are not supported.
    """
    try:
        return _sendfile_use_sendfile(sock, file, offset, count)
    except _GiveupOnSendfile:
        return _sendfile_use_send(sock, file, offset, count)
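
The dispatch at the end of the backport (try the fast path, fall back when it gives up) can be sketched on its own. The version below is a simplified Python 3 illustration, not the backport itself: the names GiveUpOnSendfile, _fast_path(), _slow_path() and send_file() are hypothetical, it ignores timeouts, offset and count, and unlike the backport it only gives up before any data has been sent.

```python
import io
import os
import socket


class GiveUpOnSendfile(Exception):
    """Signals that the fast path cannot handle this file object."""


def _fast_path(sock, fileobj):
    try:
        fileno = fileobj.fileno()
    except (AttributeError, io.UnsupportedOperation) as err:
        raise GiveUpOnSendfile(err)       # e.g. io.BytesIO has no descriptor
    if not hasattr(os, "sendfile"):
        raise GiveUpOnSendfile("os.sendfile() not available")
    size = os.fstat(fileno).st_size
    total = 0
    while total < size:
        sent = os.sendfile(sock.fileno(), fileno, total, size - total)
        if sent == 0:
            break                         # EOF
        total += sent
    return total


def _slow_path(sock, fileobj, blocksize=8192):
    total = 0
    while True:
        data = fileobj.read(blocksize)
        if not data:
            break                         # EOF
        sock.sendall(data)
        total += len(data)
    return total


def send_file(sock, fileobj):
    """Try the zero-copy path first, then fall back on read() + sendall()."""
    try:
        return _fast_path(sock, fileobj)
    except GiveUpOnSendfile:
        return _slow_path(sock, fileobj)
```

With this shape, an in-memory object such as io.BytesIO is sent through the slow path, while a regular file goes through os.sendfile() where the platform provides it.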


Translated from: https://www.pybloggers.com/2014/06/python-and-sendfile/

Published by: 全栈程序员-用户IM. When reposting, please cite the source: https://javaforall.cn/135048.html (original link: https://javaforall.cn)
