I’ve never been sure that I understand the difference between str/unicode decode and encode.
I know that str().decode() is for when you have a string of bytes that you know has a certain character encoding, given that encoding name it will return a unicode string.
I know that unicode().encode() converts unicode chars into a string of bytes according to a given encoding name.
But I don’t understand what str().encode() and unicode().decode() are for. Can anyone explain, and possibly also correct anything else I’ve gotten wrong above?
EDIT:
Several answers give info on what .encode does on a string, but no-one seems to know what .decode does for unicode.
Solution
The decode method of unicode strings really doesn’t have any applications at all (unless you have some non-text data in a unicode string for some reason; see below). It is mainly there for historical reasons, I think. In Python 3 it is completely gone.
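As a quick check of the Python 3 claim, here is a minimal sketch, assuming a Python 3 interpreter (the rest of this answer is about Python 2). The variable names are only illustrative. It shows that text only encodes and bytes only decode:

```python
# Python 3 only: str is text, bytes is binary data.
text = "ö"

# str -> bytes: encoding is the only direction offered on str.
data = text.encode("utf-8")

# bytes -> str: decoding is the only direction offered on bytes.
round_tripped = data.decode("utf-8")

# The "wrong direction" methods from Python 2 no longer exist.
has_str_decode = hasattr(text, "decode")
has_bytes_encode = hasattr(data, "encode")

print(data, round_tripped == text, has_str_decode, has_bytes_encode)
```

Calling `text.decode()` or `data.encode()` there raises AttributeError, so the confusing cases from the question cannot occur.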
unicode().decode() will perform an implicit encoding of s using the default (ascii) codec. Verify this like so:
>>> s = u'ö'
>>> s.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0:
ordinal not in range(128)
>>> s.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0:
ordinal not in range(128)
The error messages are exactly the same.
For str().encode() it’s the other way around — it attempts an implicit decoding of s with the default encoding:
>>> s = 'ö'
>>> s.decode('utf-8')
u'\xf6'
>>> s.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0:
ordinal not in range(128)
Used like this, str().encode() is also superfluous.
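To make the two implicit conversions explicit, here is a rough emulation written for Python 3. It is a sketch, not Python 2’s actual implementation. The helper names `py2_style_unicode_decode`, `py2_style_str_encode` and `error_name` are hypothetical, and the ASCII default is hard-coded as an assumption:

```python
def py2_style_unicode_decode(text, encoding="ascii"):
    """Hypothetical stand-in for Python 2's unicode.decode()."""
    raw = text.encode("ascii")      # the implicit step: fails on non-ASCII text
    return raw.decode(encoding)     # the decode that was actually requested


def py2_style_str_encode(raw, encoding="ascii"):
    """Hypothetical stand-in for Python 2's str.encode() with a text codec."""
    text = raw.decode("ascii")      # the implicit step: fails on non-ASCII bytes
    return text.encode(encoding)    # the encode that was actually requested


def error_name(func, arg):
    """Return the name of the UnicodeError raised by func(arg), or None."""
    try:
        func(arg)
    except UnicodeError as exc:
        return type(exc).__name__
    return None


# Non-ASCII input trips over the implicit step, not the requested one.
print(error_name(py2_style_unicode_decode, "ö"))       # implicit encode fails
print(error_name(py2_style_str_encode, b"\xc3\xb6"))   # implicit decode fails

# Pure ASCII input passes through unchanged, which hides the problem.
print(py2_style_unicode_decode("abc"), py2_style_str_encode(b"abc"))
```

The error always names the opposite operation from the one requested, which is why the tracebacks above are so confusing.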
But there is another application of the latter method that is useful: there are encodings that have nothing to do with character sets, and thus can be applied to 8-bit strings in a meaningful way:
>>> s.encode('zip')
'x\x9c;\xbc\r\x00\x02>\x01z'
You are right, though: the ambiguous usage of “encoding” for both these applications is… awkward. Again, with separate byte and string types in Python 3, this is no longer an issue.
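For comparison, here is how the same bytes-to-bytes transform can be written in Python 3. This is a minimal sketch, assuming Python 3.2 or later, where such codecs are reached through `codecs.encode()` and `codecs.decode()` rather than through methods on the string itself. The codec name `zlib_codec` is used here instead of the `zip` alias:

```python
import codecs
import zlib

# The UTF-8 bytes of 'ö', the same two bytes as the Python 2 str above.
raw = "ö".encode("utf-8")

# Bytes-to-bytes "encoding": compression, unrelated to character sets.
packed = codecs.encode(raw, "zlib_codec")

# Both the codec and the zlib module can undo it.
unpacked_via_codec = codecs.decode(packed, "zlib_codec")
unpacked_via_zlib = zlib.decompress(packed)

print(unpacked_via_codec == raw, unpacked_via_zlib == raw)
```

Keeping these transforms off the `str` and `bytes` methods is what removes the ambiguity: `.encode()` always means text to bytes, and `.decode()` always means bytes to text.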
Published by 全栈程序员-用户IM. When reposting, please credit the source: https://javaforall.cn/185136.html Original link: https://javaforall.cn