Show HN: Textcase - a Python library for text case conversion
A feature-complete Python text case conversion library. pypi.org/project/textcase/
License
GPL-3.0 license
zobweyt/textcase
textcase
A feature complete Python text case conversion library.
Installation
pip install textcase
Example
You can convert a string into a given case with the textcase.convert function:
from textcase import case, convert
print(convert("ronnie james dio", case.SNAKE)) # ronnie_james_dio
print(convert("Ronnie_James_dio", case.CONSTANT)) # RONNIE_JAMES_DIO
print(convert("RONNIE_JAMES_DIO", case.KEBAB)) # ronnie-james-dio
print(convert("RONNIE-JAMES-DIO", case.CAMEL)) # ronnieJamesDio
print(convert("ronnie-james-dio", case.PASCAL)) # RonnieJamesDio
print(convert("RONNIE JAMES DIO", case.LOWER)) # ronnie james dio
print(convert("ronnie james dio", case.UPPER)) # RONNIE JAMES DIO
print(convert("ronnie-james-dio", case.TITLE)) # Ronnie James Dio
print(convert("ronnie james dio", case.SENTENCE)) # Ronnie james dio
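By default, textcase.convert and textcase.converter.CaseConverter.convert split along a set of default word boundaries:
- underscores _,
- hyphens -,
- spaces,
- changes in capitalization from lowercase to uppercase aA,
- adjacent digits and letters a1, 1a, A1, 1A,
- and acronyms AAa (as in HTTPRequest).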
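The default boundaries listed above can be approximated with a single regular expression. This is only an illustrative, ASCII-only sketch built on the standard library; the name split_words is hypothetical and this is not how textcase itself is implemented.

```python
import re

# Illustrative approximation of the default word boundaries (ASCII only).
# This is NOT textcase's implementation; split_words is a hypothetical helper.
_BOUNDARY = re.compile(
    r"[_\- ]+"                    # underscores, hyphens, spaces (consumed)
    r"|(?<=[a-z])(?=[A-Z])"       # lowercase followed by uppercase: aA
    r"|(?<=[A-Z])(?=[A-Z][a-z])"  # end of an acronym: AAa
    r"|(?<=[A-Za-z])(?=[0-9])"    # letter followed by digit: a1, A1
    r"|(?<=[0-9])(?=[A-Za-z])"    # digit followed by letter: 1a, 1A
)


def split_words(text: str) -> list[str]:
    """Split text on the boundaries above and drop empty pieces."""
    return [word for word in _BOUNDARY.split(text) if word]


print(split_words("HTTPRequest"))          # ['HTTP', 'Request']
print(split_words("__weird--var _name-"))  # ['weird', 'var', 'name']
print("_".join(w.lower() for w in split_words("myJSONParser")))  # my_json_parser
```

Dropping empty pieces is what makes leading, trailing, and repeated delimiters disappear in this sketch. Splitting on empty matches requires Python 3.7 or later.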
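For more precision, you can restrict splitting to the word boundaries of a particular case. For example, splitting from snake case uses only underscores as word boundaries: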
from textcase import boundary, case, convert
print(convert("2020-04-16_my_cat_cali", case.TITLE)) # 2020 04 16 My Cat Cali
print(convert("2020-04-16_my_cat_cali", case.TITLE, (boundary.UNDERSCORE,))) # 2020-04-16 My Cat Cali
This library can detect acronyms in camel-like strings. It also ignores any leading, trailing, or duplicate delimiters:
from textcase import case, convert
print(convert("IOStream", case.SNAKE)) # io_stream
print(convert("myJSONParser", case.SNAKE)) # my_json_parser
print(convert("__weird--var _name-", case.SNAKE)) # weird_var_name
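It also works with non-ASCII characters. However, no inferences about the language itself are made. For instance, the Dutch digraph ij will not be capitalized, because it is represented as two distinct Unicode characters. However, æ would be capitalized: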
from textcase import case, convert
print(convert("GranatÄpfel", case.KEBAB)) # granat-äpfel
print(convert("ПЕРСПЕКТИВА24", case.TITLE)) # Перспектива 24
print(convert("ὈΔΥΣΣΕΎΣ", case.LOWER)) # ὀδυσσεύς
By default, a letter followed by a digit, and vice versa, is treated as a word boundary. In addition, any special ASCII characters (besides _ and -) are ignored:
from textcase import case, convert
print(convert("E5150", case.SNAKE)) # e_5150
print(convert("10,000Days", case.SNAKE)) # 10,000_days
print(convert("Hello, world!", case.UPPER)) # HELLO, WORLD!
print(convert("ONE\nTWO\nTHREE", case.TITLE)) # One\ntwo\nthree
You can also test what case a string is in:
from textcase import case, is_case
print(is_case("css-class-name", case.KEBAB)) # True
print(is_case("css-class-name", case.SNAKE)) # False
print(is_case("UPPER_CASE_VAR", case.SNAKE)) # False
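One way to read these results is as a round-trip check: a string is "in" a case if converting it to that case leaves it unchanged. The sketch below illustrates that reading with the standard library only. Both helper names are hypothetical, and the round-trip interpretation is an assumption about the semantics, not a description of how textcase implements is_case.

```python
import re


def to_delimited(text: str, delimiter: str) -> str:
    """Lowercase the words of text and join them with delimiter.

    Hypothetical helper: splits on '_', '-', spaces, and lower-to-upper
    changes only, which is enough for the examples shown here.
    """
    words = re.split(r"[_\- ]+|(?<=[a-z])(?=[A-Z])", text)
    return delimiter.join(word.lower() for word in words if word)


def looks_like(text: str, delimiter: str) -> bool:
    """Assumed semantics: True if conversion leaves text unchanged."""
    return to_delimited(text, delimiter) == text


print(looks_like("css-class-name", "-"))  # True  (already kebab-like)
print(looks_like("css-class-name", "_"))  # False (would become css_class_name)
print(looks_like("UPPER_CASE_VAR", "_"))  # False (would become upper_case_var)
```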
Boundary Specificity
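It can be difficult to determine how to split a string into words. That is why this library provides the textcase.convert and textcase.converter.CaseConverter.convert functionality, but sometimes that is not enough for a specific use case.
Say an identifier contains the word 2D, as in scale2D. Using textcase.convert or textcase.converter.CaseConverter.convert on its own will not solve the problem. In this case, we can further specify which boundaries to split the string on. This library provides some patterns for achieving this specificity. We can specify the boundaries we want to split on using instances of the textcase.boundary.Boundary class: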
from textcase import boundary, case, convert
# Not quite what we want
print(convert("scale2D", case.SNAKE, case.CAMEL.boundaries)) # scale_2_d
# Write boundaries explicitly
print(convert("scale2D", case.SNAKE, (boundary.LOWER_DIGIT,))) # scale_2d
Custom Boundaries
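This library provides a number of boundary constants associated with common cases. You can also create your own boundary to split on other criteria: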
from textcase import case, convert
from textcase.boundary import Boundary
# Not quite what we want
print(convert("coolers.revenge", case.TITLE)) # Coolers.revenge
# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)
print(convert("coolers.revenge", case.TITLE, (DOT,))) # Coolers Revenge
# Define complex custom boundary
AT_LETTER = Boundary(
    satisfies=lambda text: (len(text) > 1 and text[0] == "@") and (text[1] == text[1].lower()),
    start=1,
    length=0,
)
print(convert("name@domain", case.TITLE, (AT_LETTER,))) # Name@ Domain
To learn more about building a boundary from scratch, take a look at the textcase.boundary.Boundary class.
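The outputs above suggest how the start and length arguments behave: the word ends start characters after the position where satisfies matches, and the next length characters are dropped. The following standard-library sketch encodes that reading. SketchBoundary and split are hypothetical names, and the semantics are inferred from the two examples rather than taken from the textcase source.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SketchBoundary:
    """Hypothetical stand-in for a boundary; not textcase's Boundary class."""

    satisfies: Callable[[str], bool]  # tested against the rest of the string
    start: int = 0   # assumed: offset from the match where the word ends
    length: int = 0  # assumed: characters dropped after that offset


def split(text: str, boundaries: Iterable[SketchBoundary]) -> list[str]:
    """Split text wherever one of the boundaries matches."""
    boundaries = tuple(boundaries)
    words, word_start = [], 0
    for i in range(len(text)):
        for boundary in boundaries:
            if boundary.satisfies(text[i:]):
                words.append(text[word_start : i + boundary.start])
                word_start = i + boundary.start + boundary.length
                break
    words.append(text[word_start:])
    return [word for word in words if word]  # drop empty words


DOT = SketchBoundary(satisfies=lambda t: t.startswith("."), length=1)
AT_LETTER = SketchBoundary(
    satisfies=lambda t: len(t) > 1 and t[0] == "@" and t[1] == t[1].lower(),
    start=1,
)

print(split("coolers.revenge", (DOT,)))    # ['coolers', 'revenge']
print(split("name@domain", (AT_LETTER,)))  # ['name@', 'domain']
```

Under this reading, DOT consumes the dot itself (length=1), while AT_LETTER splits just after the @ and consumes nothing (start=1, length=0), which is consistent with the "Name@ Domain" output shown earlier.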
Custom Case
Similar to textcase.boundary.Boundary, textcase.case.Case exposes the three components needed for case conversion. This lets you define a custom case that behaves appropriately in the textcase.convert and textcase.converter.CaseConverter.convert functions:
from textcase import convert
from textcase.boundary import Boundary
from textcase.case import Case
from textcase.pattern import lower
# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)
# Define custom case
DOT_CASE = Case(
    boundaries=(DOT,),
    pattern=lower,
    delimiter=".",
)
print(convert("Dot case var", DOT_CASE)) # dot.case.var
And because we defined boundary conditions, textcase.is_case should also behave as expected:
from textcase import is_case
from textcase.boundary import Boundary
from textcase.case import Case
from textcase.pattern import lower
# Define custom boundary
DOT = Boundary(
    satisfies=lambda text: text.startswith("."),
    length=1,
)
# Define custom case
DOT_CASE = Case(
    boundaries=(DOT,),
    pattern=lower,
    delimiter=".",
)
print(is_case("dot.case.var", DOT_CASE)) # True
print(is_case("Dot case var", DOT_CASE)) # False
Case converter class
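Case conversion takes place in two parts. The first splits an identifier into a series of words, and the second joins the words back together. These steps are defined using the textcase.converter.CaseConverter.from_case and textcase.converter.CaseConverter.to_case functions, respectively.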
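CaseConverter is a class that encapsulates the boundaries used for splitting and the pattern and delimiter used for mutating and joining. The convert method applies the boundaries, pattern, and delimiter appropriately. This lets you define the parameters for case conversion upfront: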
from textcase import CaseConverter, case, pattern
converter = CaseConverter()
converter.pattern = pattern.camel
converter.delimiter = "_"
print(converter.convert("My Special Case")) # my_Special_Case
converter.from_case(case.CAMEL)
converter.to_case(case.SNAKE)
print(converter.convert("mySpecialCase")) # my_special_case
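The split, mutate, and join steps described in this section can be sketched with the standard library alone. Everything below (convert_sketch, the regex boundaries, and the local lower and camel functions) is a hypothetical illustration of the idea, not the CaseConverter implementation.

```python
import re
from typing import Callable

# A pattern takes the list of words and returns the mutated list.
Pattern = Callable[[list[str]], list[str]]


def lower(words: list[str]) -> list[str]:
    """Lowercase every word."""
    return [word.lower() for word in words]


def camel(words: list[str]) -> list[str]:
    """Lowercase the first word and capitalize the rest."""
    return [w.lower() if i == 0 else w.capitalize() for i, w in enumerate(words)]


def convert_sketch(text: str, boundary: str, pattern: Pattern, delimiter: str) -> str:
    """Split on a boundary regex, mutate the words, then join them."""
    words = [word for word in re.split(boundary, text) if word]  # part 1: split
    return delimiter.join(pattern(words))                        # part 2: join


# Split on spaces, apply a camel pattern, join with underscores.
print(convert_sketch("My Special Case", r" +", camel, "_"))  # my_Special_Case

# Split on lower-to-upper changes, lowercase, join with underscores.
print(convert_sketch("mySpecialCase", r"(?<=[a-z])(?=[A-Z])", lower, "_"))  # my_special_case
```

Keeping the pattern and the delimiter as separate parameters is what allows mixed results such as my_Special_Case, where a camel pattern is combined with an underscore delimiter.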
For more details on how strings are converted, see the documentation for textcase.converter.CaseConverter.
API
Modules
textcase.boundary | Conditions for splitting an identifier into words.
textcase.case | Case definitions for text conversion