I have a program that require all keywords to be in a single paragraph, most of the time, separated by commas

For example:

I have those terms

1-Term
1.1-Term
2-Term
3-Term
4-Term

That i collected and organized into groups and subgroups with Titles and subtitles

Title

  • 1-Term

  • 1.1-Term

  • 2-Term

    • Sub-Title
      • 3-Term
      • 4-Term

But then i want to turn them into:

1-Term, 1.1-Term, 2-Term, 3-Term, 4-Term 
 

Removing certain marked words(Titles and sub-Titles), any Empty/Blank space, and Line breaks, while adding the commas between The Terms. I want to keep certain dashes “-”(like in words )

1-Term,1.1 -Term,2-Term,3-Term,4-Term

  • bus_factor@lemmy.world
    link
    fedilink
    arrow-up
    1
    ·
    7 hours ago

    If you can stick to valid YAML like your example is, you can use a reasonably short yq command to get a comma-separated string of all scalar values:

    $ yq -r '[.. | scalars] | join(",")' /tmp/foo.txt                
    Harry potter,Perfect Blue,Jurassic world,Jurassic Park,Jedi,Star wars,The clone wars,MCU,Gumball,Flapjack,Steven Universe,Stars vs. the forces of Evil,Wordgril,Flapjack
    

    .. goes down the tree recursively, scalars filters out only scalar values, [] around those two makes them an array, and piping it all to join(",") makes it into a comma-separated string.