Can someone point me in the right direction on this?
I have a string that contains sentences of words
e.g. ‘He was trying to learn a pythonic, or regex, way of solving his problem’
The string in question is quite large and I need to break it up into multiple lines, where each line cannot exceed 64 characters.
BUT I can't just insert a line break every 64 characters. I need to ensure the break occurs at the closest character (from a set of characters) before the 64th character, so that the line does not exceed 64 characters.
e.g. I can only insert a line break after a space, comma or full stop.
I also need the solution to be quite efficient as it is an action that will occur many, many times.
Using textwrap
I’m not sure textwrap is the way to go for my problem because I need to preserve the original line breaks in the input string.
Example:
import textwrap
long_str = """
123456789 123456789 123456789 123456789 123456789 123456789
Line 1: Artificial intelligence (AI), sometimes called machine intelligence,
Line 2: is intelligence demonstrated by machines,
Line 3: in contrast to the natural intelligence displayed by humans and other animals.
Line 4: In computer science AI research is defined as
"""
# Wrap at 60 characters without splitting words
lines = textwrap.wrap(long_str, 60, break_long_words=False)
print('\n'.join(lines))
What I want is this:
123456789 123456789 123456789 123456789 123456789 123456789
Line 1: Artificial intelligence (AI), sometimes called
machine intelligence,
Line 2: is intelligence demonstrated by machines,
Line 3: in contrast to the natural intelligence displayed by
humans and other animals.
Line 4: In computer science AI research is defined as
But textwrap gives me this:
123456789 123456789 123456789 123456789 123456789 123456789
Line 1: Artificial intelligence (AI), sometimes called
machine intelligence, Line 2: is intelligence demonstrated
by machines, Line 3: in contrast to the natural intelligence
displayed by humans and other animals. Line 4: In computer
science AI research is defined as
I suspect that regex is probably the answer, but I'm out of my depth trying to solve this with regex.
3 Answers
It might help us answer your question if you could provide any code you have already attempted. That being said, I believe the following example code will preserve existing line breaks, wrap lines exceeding 64 characters and preserve the format of the rest of the string.
The output from Python is:
The code iterates over the lines of the string.
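For illustration, here is one way the behaviour described above could be done with a regular expression, since the question asks about regex. This is only a sketch under assumptions, not necessarily the code this answer used: the names `LIMIT`, `break_line` and `break_text` are made up for the example, the break characters are the space, comma and full stop from the question, and a run with no permitted break point is simply cut at the limit.

```python
import re

LIMIT = 64  # maximum line length, taken from the question

# Take up to LIMIT characters that end just after a space, comma or full
# stop, or that reach the end of the line. The last alternative is a hard
# split, used only when no permitted break point exists within LIMIT chars.
_CHUNK = re.compile(r".{1,%d}(?:(?<=[ ,.])|$)|.{1,%d}" % (LIMIT, LIMIT))


def break_line(line):
    """Split one line into pieces of at most LIMIT characters."""
    if len(line) <= LIMIT:
        return [line]  # short (or empty) lines are left untouched
    # Drop the trailing space left at each break point.
    return [chunk.rstrip(" ") for chunk in _CHUNK.findall(line)]


def break_text(text):
    """Break long lines while keeping the existing line breaks."""
    out = []
    for line in text.split("\n"):
        out.extend(break_line(line))
    return "\n".join(out)
```

Calling `print(break_text(long_str))` on the string from the question should keep every original line break and only add new ones inside lines longer than 64 characters. The pattern is compiled once, which may help if the function is called many times.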
Split the long string into separate lines on the newline. Wrap each separate line “as usual”, then concatenate everything again to a single string.
As textwrap.wrap returns a list of strings, you don't need to do anything other than keep appending them to one list, and join them at the end.
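The steps described above could be sketched roughly as follows. The function name `wrap_keep_breaks` and the default width of 64 are assumptions made for this example; `textwrap.wrap` and its `break_long_words` argument are from the standard library.

```python
import textwrap


def wrap_keep_breaks(text, width=64):
    """Wrap each original line separately, keeping existing line breaks."""
    wrapped = []
    for line in text.split("\n"):
        pieces = textwrap.wrap(line, width, break_long_words=False)
        # textwrap.wrap returns [] for an empty line; keep it as a blank line
        wrapped.extend(pieces or [""])
    return "\n".join(wrapped)
```

Note that textwrap breaks at whitespace (and, by default, after hyphens), not directly after a comma or full stop, and with `break_long_words=False` a single word longer than the width will still produce an over-long line.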