I have a program that requires all keywords to be in a single paragraph, usually separated by commas.

For example:

I have these terms:

1-Term
1.1-Term
2-Term
3-Term
4-Term

which I collected and organized into groups and subgroups with titles and subtitles:

Title

  • 1-Term

  • 1.1-Term

  • 2-Term

    • Sub-Title
      • 3-Term
      • 4-Term

But then I want to turn them into:

1-Term, 1.1-Term, 2-Term, 3-Term, 4-Term 
 

That means removing certain marked words (the titles and subtitles), any empty or blank space, and line breaks, while adding commas between the terms. I want to keep certain dashes “-” (like the ones inside words):

1-Term,1.1-Term,2-Term,3-Term,4-Term
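
The transformation described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `flatten_terms`, the `BULLET` pattern, and the idea of passing the titles in as an explicit set are all hypothetical, since the post does not say how the titles are marked. A dash is treated as a bullet only when whitespace follows it, so dashes inside terms such as `1-Term` are kept.

```python
import re

# Leading bullet characters to strip. A dash only counts as a bullet when
# it is followed by whitespace, so dashes inside terms ("1-Term") survive.
BULLET = re.compile(r"^\s*(?:[•*·]|-(?=\s))\s*")

def flatten_terms(text, titles, sep=", "):
    """Join every non-blank, non-title line of `text` with `sep`.

    `titles` is an assumed way of marking headings: any line whose text
    (after bullet removal) matches one of them is dropped.
    """
    skip = {t.strip().lower() for t in titles}
    terms = []
    for line in text.splitlines():
        term = BULLET.sub("", line).strip()
        if term and term.lower() not in skip:
            terms.append(term)
    return sep.join(terms)

outline = """Title

  • 1-Term

  • 1.1-Term

  • 2-Term

    • Sub-Title
      • 3-Term
      • 4-Term
"""

print(flatten_terms(outline, titles={"Title", "Sub-Title"}))
# prints: 1-Term, 1.1-Term, 2-Term, 3-Term, 4-Term
```

Passing `sep=","` gives the variant without spaces after the commas.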

  • bus_factor@lemmy.world
    2 hours ago

    If you can’t install a dedicated tool like yq but don’t mind creating a standalone script, Python would be able to do this more or less out of the box (it only needs the PyYAML package for the yaml module) on pretty much any computer, calculator or toaster you can get your hands on in 2026:

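    #! /usr/bin/env python3
    
    import sys
    
    import yaml  # PyYAML, a third-party package
    
    def parse_yaml(filename):
        # safe_load only builds plain dicts, lists and scalars
        with open(filename) as fd:
            return yaml.safe_load(fd)
    
    def get_leaf_nodes(data_iterable):
        # Recursively collect every value that is not itself a dict or list.
        # Dict keys (the titles) are dropped because only .values() is visited.
        output = []
        for v in data_iterable:
            if isinstance(v, dict):
                output += get_leaf_nodes(v.values())
            elif isinstance(v, list):
                output += get_leaf_nodes(v)
            else:
                output.append(v)
        return output
    
    # Wrap the parsed document in a list so a top-level mapping is handled like
    # a nested one, and convert leaves to str so non-string terms can be joined.
    print(",".join(str(v) for v in get_leaf_nodes([parse_yaml(sys.argv[1])])))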
    
    $ /tmp/foo.py /tmp/foo.txt
    Harry potter,Perfect Blue,Jurassic world,Jurassic Park,Jedi,Star wars,The clone wars,MCU,Gumball,Flapjack,Steven Universe,Stars vs. the forces of Evil,Wordgril,Flapjack
    

    This takes the file named by the first command-line argument, parses it as YAML, finds all leaf nodes recursively, and prints the results as a comma-separated list.
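
To see what the recursion does without needing a YAML file, here is the same leaf-collecting idea run on an already-parsed structure. The nested dict below is an assumed shape for the outline in the question, roughly what a YAML parser might return for it; it is not the contents of /tmp/foo.txt.

```python
def get_leaf_nodes(data_iterable):
    """Collect every value that is not a dict or list, depth first."""
    output = []
    for v in data_iterable:
        if isinstance(v, dict):
            # Only the values are visited, so keys (the titles) are dropped.
            output += get_leaf_nodes(v.values())
        elif isinstance(v, list):
            output += get_leaf_nodes(v)
        else:
            output.append(v)
    return output

# Assumed shape: titles as mapping keys, terms as list items.
parsed = {
    "Title": [
        "1-Term",
        "1.1-Term",
        "2-Term",
        {"Sub-Title": ["3-Term", "4-Term"]},
    ]
}

# Wrapping the document in a list means a top-level mapping is treated
# like any nested one instead of being iterated by its keys.
result = ",".join(str(v) for v in get_leaf_nodes([parsed]))
print(result)
# prints: 1-Term,1.1-Term,2-Term,3-Term,4-Term
```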