czepeda.com

JSON to YAML to Markdown file Shell Script

Link: https://github.com/cazepeda/json-to-yaml-to-md-file

Tools Used:

  • Terminal
  • Visual Studio Code
  • jq
  • Shell Script

I currently have a pinboard.in account that will expire soon. I've had that account since the formation of Pinboard back in 2008. It's been a great service, but now that I've got my website going again, I thought I'd start importing all those bookmarks into it.

Pinboard.in can export bookmarks as JSON, XML, or HTML. To import them into Statamic, I need to convert the JSON data into YAML and save each result as a Markdown document. Luckily, JSON and YAML both store data as key/value pairs, e.g. key: value. That was a plus, because my first thought was to create a shell script that converts each JSON object into a separate YAML/Markdown file. Each Pinboard.in bookmark is saved as a JSON object like so.

[
    {
        "href": "https:\/\/teamrv-mvp.sos.texas.gov\/MVP\/back2HomePage.do",
        "description": "My Voter Portal",
        "extended": "",
        "meta": "f37ee28f61e35f5db09c653a4d3c82ac",
        "hash": "99aba043d80c1b0765802fd7e6284a86",
        "time": "2024-01-31T16:18:56Z",
        "shared": "yes",
        "toread": "no",
        "tags": "texas voting"
    },
    {
        "href": "https:\/\/publicdomainreview.org\/collection\/flammarion-engraving\/",
        "description": "Wheels Within Wheels: The \u201cFlammarion Engraving\u201d (ca. 1888) \u2013 The Public Domain Review",
        "extended": "",
        "meta": "4b9ed2dd7330962794ca45fe901fba72",
        "hash": "b864adf2c9eb1f92ba1ad8aa43b23d8e",
        "time": "2024-01-30T06:22:42Z",
        "shared": "yes",
        "toread": "no",
        "tags": "art history"
    },
    {
        "href": "https:\/\/www.contemporist.com\/an-exterior-of-vertical-wood-siding-helps-this-house-blend-into-the-surrounding-forest\/",
        "description": "An Exterior Of Vertical Wood Siding Helps This House Blend Into The Surrounding Forest",
        "extended": "",
        "meta": "188364b64f99012a1ab86a4901bb2452",
        "hash": "b3e724498141b8426c05ae0933b98520",
        "time": "2024-01-26T15:21:10Z",
        "shared": "yes",
        "toread": "no",
        "tags": "home"
    }
]

The first thing I tackled was extracting each object from the JSON and converting it to a Markdown file. I'm using jq, a command-line JSON processor, to format the data as needed. This part of the script first checks that jq is installed, points at the .json file to work with, creates a directory for the Markdown files, and uses the .time and .description properties to name each file. I also had to format the JSON key/value pairs into a YAML structure. This was easy enough, with the exception of the tags property, whose values need to be written as a list. I tackled that in the second part of the shell script. Originally I wrote each part as a separate script to keep testing simple and easy to manage.

#!/bin/bash

# CREATE MARKDOWN FILES FROM JSON

# Check if jq is installed
if ! command -v jq &> /dev/null; then
    echo "Please install jq before running this script."
    exit 1
fi

# Input JSON file
input_json="/path/to/file/test.json"

# Output directory for Markdown files
output_dir="output_markdown"

# Create output directory if it doesn't exist
mkdir -p "$output_dir"

# Extract each object from JSON and convert to Markdown
# IFS= and read -r keep whitespace and backslashes in each JSON line intact
jq -c '.[]' "$input_json" | while IFS= read -r obj; do
    # Generate a unique filename based on some property of the JSON object
    # filename="$output_dir/$(echo "$obj" | jq -r '"\(.time).\(.description)"').md"
    # Date prefix plus lowercased description; "/" becomes "-" so it cannot be read as a path separator
    filename="$output_dir/$(printf '%s\n' "$obj" | jq -r '"\(.time | fromdate | strftime("%Y-%m-%d")).\(.description | ascii_downcase)" | gsub(" "; "-") | gsub("/"; "-") | gsub("\\("; "") | gsub("\\)"; "") | gsub("\\\""; "") | gsub("“"; "") | gsub("”"; "") | gsub(":"; "")').md"

    # Convert JSON object to Markdown and save it to the file
    # (printf instead of echo -e, which would expand escapes such as \n inside the JSON)
    printf '%s\n' "$obj" | jq -r '["---"] + (to_entries | map("\(.key): \(.value)")) + ["---"] | join("\n")' > "$filename"

    echo "Converted and saved to: $filename"
done

# END CREATE MARKDOWN FILES FROM JSON
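
The gsub chain above only strips the characters it names, so a description containing something like & or ? still ends up in the filename as-is. Here is a minimal sketch of a stricter naming rule, assuming it is acceptable to reduce every run of characters outside a-z and 0-9 to a single hyphen. The make_filename helper is hypothetical and not part of the original script; it uses tr and sed instead of jq so it can be tried on its own.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): build "YYYY-MM-DD.slug.md"
# from a Pinboard timestamp and a description.
make_filename() {
    local time="$1" description="$2"
    local date="${time%%T*}"   # "2024-01-31T16:18:56Z" -> "2024-01-31"
    local slug
    # Lowercase, turn every run of non-alphanumerics into one hyphen,
    # then trim a leading or trailing hyphen.
    slug=$(printf '%s' "$description" \
        | tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C sed -E 's/[^a-z0-9]+/-/g; s/^-//; s/-$//')
    printf '%s.%s.md\n' "$date" "$slug"
}

make_filename "2024-01-31T16:18:56Z" "My Voter Portal"
make_filename "2024-01-30T06:22:42Z" "Wheels: The (ca. 1888) Review"
```

The first call prints 2024-01-31.my-voter-portal.md. This rule is lossier than the jq version (periods and non-ASCII characters are dropped too), so it is a trade-off rather than a drop-in replacement.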

This is the part of the shell script that converts the tags property's values into a list. I needed a list because Statamic creates an index page listing all posts related to history, or to any other tag created from the tags property's values.

This part of the script sets one variable for the directory where the Markdown files were created and another for the separator (a space) between tag values. It checks that the directory exists, then goes through each Markdown file, finds the tags property, and rewrites its values as a YAML list.

# FORMAT MARKDOWN FILES

# Directory containing Markdown files
directory="/path/to/directory/output_markdown"

# Separator used in the tags field
tags_separator=" "

# Check if the directory exists
if [ ! -d "$directory" ]; then
    echo "Directory not found: $directory"
    exit 1
fi

# Process each Markdown file in the directory
for file in "$directory"/*.md; do
    # Check if the file is a regular file
    if [ -f "$file" ]; then
        # Process the Markdown file and create a temporary file
        temp_file=$(mktemp)
        awk -v sep="$tags_separator" -F': ' '            
            $1 == "tags" {
                # Convert the values into a list format
                printf "%s:\n", $1
                n = split($2, tags, sep)
                for (i = 1; i <= n; i++) {
                    printf " - %s\n", tags[i]
                }
            }
            $1 != "tags" {            
                # For other lines, print as is
                print
            }
        ' "$file" > "$temp_file"

        # Replace the original file with the updated content
        mv "$temp_file" "$file"
        echo "Updated: $file"
    fi
done

# END FORMAT MARKDOWN FILES
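
To see the tags conversion in isolation, here is a small sketch that runs the same awk idea on a few sample lines from standard input. The expand_tags function name is hypothetical, and printing tags: [] for a bookmark without tags is an assumption of this sketch; the script above prints a bare tags: line in that case.

```shell
#!/bin/bash
# Hypothetical helper: read front matter on stdin and expand a
# "tags: a b" line into a YAML list, leaving every other line untouched.
expand_tags() {
    awk -F': ' '
        $1 == "tags" {
            n = split($2, tags, " ")
            # Assumption: an explicit empty list is preferable to a bare "tags:"
            if (n == 0) { print "tags: []"; next }
            print "tags:"
            for (i = 1; i <= n; i++) printf " - %s\n", tags[i]
            next
        }
        { print }
    '
}

printf '%s\n' '---' 'description: My Voter Portal' 'tags: texas voting' '---' | expand_tags
```

For the sample input, the tags line comes out as tags: followed by one " - texas" line and one " - voting" line, while the description line passes through unchanged.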

The final part of this shell script was optional, but I wanted to wrap specific property values in single quotes. First I define which keys to target and set a directory variable to locate the Markdown files. Then I loop through the files and, for each key listed in the target_keys variable, wrap its value in single quotes.

# ADD SINGLE TICKS TO SPECIFIED KEYS

# Specify the keys for which values should be enclosed in single quotes
target_keys=("href" "description" "time")

# Directory containing Markdown-like files
directory="/path/to/directory/output_markdown"

# Loop through each file in the directory
for file in "$directory"/*.md; do
    # Check if the file is a regular file
    if [ -f "$file" ]; then
        # Loop through target keys and add single quotes around values
        for key in "${target_keys[@]}"; do
            sed -E "s/^$key: (.*)$/$key: '\1'/" "$file" > "$file.temp"
            mv "$file.temp" "$file"
        done

        echo "Single quotes added to specified keys in: $file"
    fi
done

echo "Update complete."

# END ADD SINGLE TICKS TO SPECIFIED KEYS
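
One case the sed substitution does not cover is a value that already contains a single quote (a description like Bob's Guide), which would leave the YAML string unbalanced. In YAML, a single quote inside a single-quoted string is written by doubling it. A minimal sketch of that, using a hypothetical quote_key function that reads from standard input and is not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: wrap the value of one key in YAML single quotes,
# doubling any single quote that is already inside the value.
quote_key() {
    local key="$1"
    awk -v key="$key" -v q="'" '
        index($0, key ": ") == 1 {
            value = substr($0, length(key) + 3)   # text after "key: "
            gsub(q, q q, value)                   # one quote -> two quotes
            printf "%s: %s%s%s\n", key, q, value, q
            next
        }
        { print }
    '
}

printf '%s\n' "description: Bob's Guide" "tags: home" | quote_key description
```

This prints description: 'Bob''s Guide' followed by the untouched tags: home line. Like the sed version, it is not idempotent: running it twice on the same file would quote the value twice.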

The last thing to do, as with any shell script you want to run, is to make it executable. Do this by running the following command: chmod +x json-to-yaml-to-markdown-file.sh. You can confirm the change took effect by listing the file permissions with ls -la; you'll know the script is executable by the x in the permissions, e.g. -rwxr-xr-x 1 computerName staff 996 Feb 7 12:58 json-to-yaml-to-markdown-file.sh.

Once it's executable, run the script and watch the magic happen: ./json-to-yaml-to-markdown-file.sh.
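
Instead of reading the ls -la output by eye, the shell's own -x test can check the permission. A small sketch, assuming a hypothetical is_executable function and a throwaway temp file standing in for the real script:

```shell
#!/bin/bash
# Hypothetical helper: report whether the current user can execute a file.
is_executable() {
    if [ -x "$1" ]; then
        echo "executable: $1"
    else
        echo "not executable: $1 (try: chmod +x $1)"
        return 1
    fi
}

# Demonstrate on a temp file rather than the real script.
demo=$(mktemp)
chmod 600 "$demo"               # read/write only, no x bit
is_executable "$demo" || true   # reports "not executable"
chmod +x "$demo"
is_executable "$demo"           # reports "executable"
rm -f "$demo"
```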