How to find all occurrences of a substring?
Problem description
Python has string.find() and string.rfind() to get the index of a substring in a string.
I'm wondering whether there is something like string.find_all() that returns all found indexes (not only the first from the beginning or the first from the end).
For example:
string = "test test test test"
print(string.find('test'))   # 0
print(string.rfind('test'))  # 15
# This is the goal (find_all is hypothetical; str has no such method)
print(string.find_all('test'))  # [0, 5, 10, 15]
Recommended answer
There is no simple built-in string function that does what you're looking for, but you could use the more powerful regular expressions:
import re
[m.start() for m in re.finditer('test', 'test test test test')]
#[0, 5, 10, 15]
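Note that re.finditer treats its first argument as a regular expression, so a substring containing characters such as . or ( would need escaping before it matches literally. A minimal sketch, assuming a helper named find_all (a hypothetical name, not a built-in) that wraps the expression above with re.escape:

```python
import re

def find_all(haystack, needle):
    """Return the start indexes of all non-overlapping occurrences of needle.

    find_all is a hypothetical helper name, not part of the standard library.
    """
    # re.escape makes regex metacharacters in needle match literally.
    return [m.start() for m in re.finditer(re.escape(needle), haystack)]

print(find_all("test test test test", "test"))  # [0, 5, 10, 15]
print(find_all("a.b a.b axb", "a.b"))           # [0, 4]
```

Without re.escape, the pattern a.b would also match axb, because . matches any character.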
If you want to find overlapping matches, a lookahead will do that:
[m.start() for m in re.finditer('(?=tt)', 'ttt')]
#[0, 1]
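The same overlapping search can also be written without regular expressions by calling str.find repeatedly and advancing one character at a time. This is a sketch of an alternative, not the answer's method; the name find_all_overlapping and the empty-substring guard are assumptions added for illustration:

```python
def find_all_overlapping(haystack, needle):
    """Yield the start index of every occurrence, including overlapping ones.

    find_all_overlapping is a hypothetical helper name.
    """
    if not needle:
        return  # assumption: treat an empty substring as having no matches
    start = haystack.find(needle)
    while start != -1:
        yield start
        # Step forward by one character so overlapping matches are found too.
        start = haystack.find(needle, start + 1)

print(list(find_all_overlapping("ttt", "tt")))  # [0, 1]
```

Advancing by len(needle) instead of 1 would give the non-overlapping result.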
If you want a reverse find-all without overlaps, you can combine positive and negative lookahead into an expression like this:
search = 'tt'
[m.start() for m in re.finditer('(?=%s)(?!.{1,%d}%s)' % (search, len(search)-1, search), 'ttt')]
#[1]
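A reverse, non-overlapping search can also be sketched with str.rfind, scanning from the end and shrinking the search window after each hit. This is an alternative under stated assumptions rather than an equivalent of the expression above: the helper name rfind_all is hypothetical, and the results can differ on some inputs.

```python
def rfind_all(haystack, needle):
    """Return start indexes of non-overlapping occurrences, found right to left.

    rfind_all is a hypothetical helper name. Indexes are returned in
    ascending order.
    """
    if not needle:
        return []  # assumption: an empty substring has no matches
    indexes = []
    end = len(haystack)
    while True:
        pos = haystack.rfind(needle, 0, end)
        if pos == -1:
            break
        indexes.append(pos)
        # The next match must end at or before the start of this one.
        end = pos
    return indexes[::-1]

print(rfind_all("ttt", "tt"))  # [1]
```

For 'tttt' this sketch returns [0, 2], whereas the lookahead expression returns [2], because the expression rejects every match that is overlapped by a later occurrence.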
re.finditer returns an iterator, so you could change the [] in the above to () to get a generator instead of a list, which will be more efficient if you're only iterating through the results once.
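As an illustration of the generator form, the sketch below reuses the first example with () in place of []; the variable names are chosen here for illustration. Values are produced lazily, and the generator is exhausted once consumed:

```python
import re

# Parentheses create a generator expression instead of building a list.
matches = (m.start() for m in re.finditer('test', 'test test test test'))

first = next(matches)   # consumes only the first match
rest = list(matches)    # consumes whatever is left
again = list(matches)   # the generator is now exhausted

print(first)  # 0
print(rest)   # [5, 10, 15]
print(again)  # []
```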