I Wrote a Go Script to Generate Gists, Here's What I Learned

Ian Wilson - May 7 '18 - Dev Community

There is a lot of busywork that comes with writing blogs and posting them to different platforms.

For example, I usually start writing an article by creating a markdown file within my code editor, inserting code snippets and images where need be.

However, problems arise when I need to post this same markdown to Medium. Although their text editor is pretty, it is not perfectly safe to copy and paste my markdown files there.

Here are some issues:

  • code words such as fields or someFunction wrapped in backticks are not formatted properly on paste
  • larger code blocks do not currently support syntax highlighting
  • words that are bold or italicized maintain their styling, but are not separated from the *s that wrap them in my markdown file

And here is a screenshot of those 3 bullet points pasted:

[Screenshot: the three bullet points above pasted into Medium's editor]

Call me lazy, but after writing a 2500+ word post, I must:

  • go through each line and make sure my grammar makes sense.
  • ensure that my code examples work, since some of my articles are tutorials.
  • make my code look pretty through Github Gists, thus bringing along with it a whole new issue of tabs vs spaces.
  • clean up every markdown-style hyperlink and remove excess asterisks or backticks from highlighted words and phrases.

This becomes draining. It becomes a lot of work to try to produce content that can be cross-posted to multiple sites.

Why don't I just cave in and post exclusively to dev.to, where you literally just paste your markdown file?

Because that's not my nindō. I must become omnipresent throughout the internet.

Cleaning Code Snippets

Let's start with the lowest-hanging fruit: cleaning up those code snippets! How can we do this?

Well, the brute force way would be to copy and paste as normal, but remove the backticks denoting code blocks, press "CMD+option+6" to create a Medium Codeblock™, and then struggle with the awkward newlines and tabs within the Medium Codeblock™.

Nope. I'm not gonna do that again.

The next option is to manually create Github Gists, which delegates the code formatting to Github. We also get syntax highlighting out of it.

However, in my most recent toils, the tab size for gists seems to default to 8, which is absolutely ridiculous.

Nobody uses size-8 tabs, Github.

It seems that the only way to fix this issue is by replacing all instances of tabs with spaces (2 in my case).

The benefit is we get prettier code snippets. The drawback is that every tab must be replaced by spaces. My previous article had 18 JavaScript code snippets. It was a miserable editing experience.

Introducing the Gist Generator

*I hope I come up with a cooler name sometime

I decided to take advantage of my dev chops to utilize the Github API in order to create these Gists. This script, written in Go, will proceed as follows:

  1. Read in the markdown file
  2. Parse the code snippets tagged with a language such as "go" or "javascript"
  3. For each snippet, create a gist
  4. Keep a struct with references to the snippet and the gist
  5. Replace the snippets with their associated gist URLs
  6. Write a new markdown file, with code snippets replaced by gists for Medium
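Steps 2, 4, and 5 can be sketched roughly as follows. This is a minimal illustration rather than the code from the repo: the `Snippet` struct, `extractSnippets`, and `replaceSnippets` are hypothetical names, and it assumes snippets are delimited by well-formed triple-backtick fences with a language tag on the opening fence. The gist URL is a placeholder, since nothing here talks to Github.

```go
package main

import (
	"fmt"
	"strings"
)

// Snippet is a hypothetical record pairing a code block with its gist.
type Snippet struct {
	Language string // tag after the opening fence, e.g. "go"
	Content  string // the code between the fences
	GistURL  string // filled in once the gist has been created
}

// fence is the three-backtick marker that delimits a code block.
var fence = strings.Repeat("`", 3)

// extractSnippets collects every fenced block that carries a language tag.
func extractSnippets(markdown string) []Snippet {
	var snippets []Snippet
	var body []string
	lang := ""
	inside := false
	for _, line := range strings.Split(markdown, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case !inside && strings.HasPrefix(trimmed, fence) && len(trimmed) > len(fence):
			inside = true
			lang = strings.TrimPrefix(trimmed, fence)
			body = nil
		case inside && trimmed == fence:
			snippets = append(snippets, Snippet{Language: lang, Content: strings.Join(body, "\n")})
			inside = false
		case inside:
			body = append(body, line)
		}
	}
	return snippets
}

// replaceSnippets swaps the i-th tagged block for snippets[i].GistURL.
// It assumes snippets came from extractSnippets on the same markdown.
func replaceSnippets(markdown string, snippets []Snippet) string {
	var out []string
	inside := false
	next := 0
	for _, line := range strings.Split(markdown, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case !inside && strings.HasPrefix(trimmed, fence) && len(trimmed) > len(fence) && next < len(snippets):
			inside = true // start skipping the block's lines
		case inside && trimmed == fence:
			out = append(out, snippets[next].GistURL)
			next++
			inside = false
		case !inside:
			out = append(out, line)
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	md := "Intro\n" + fence + "go\nfmt.Println(\"hi\")\n" + fence + "\nOutro"
	snippets := extractSnippets(md)
	for i := range snippets {
		// Placeholder; the real URL would come back from the gist API.
		snippets[i].GistURL = fmt.Sprintf("https://gist.github.com/<user>/<gist-%d>", i)
	}
	fmt.Println(replaceSnippets(md, snippets))
}
```

Untagged fences are left untouched in this sketch, which matches the idea of only converting snippets that name a language.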

If you are interested in the source code, you can check it out in the repo. I'd like to put together some tests before I decide to sell anybody on it, but it'll be open source in any case. Currently the usage goes like this:

./gist-generator -f example.md -token <GITHUB_ACCESS_TOKEN>
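For readers curious how those two flags might be read, here is a small sketch using Go's standard `flag` package. The `options` struct and `parseArgs` function are hypothetical names for illustration; the actual tool may handle its arguments differently.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// options holds the two values from the usage line above.
type options struct {
	file  string // value of -f
	token string // value of -token
}

// parseArgs parses -f and -token from args (program name excluded).
func parseArgs(args []string) (options, error) {
	var opts options
	fs := flag.NewFlagSet("gist-generator", flag.ContinueOnError)
	fs.StringVar(&opts.file, "f", "", "path to the markdown file")
	fs.StringVar(&opts.token, "token", "", "Github personal access token")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	if opts.file == "" || opts.token == "" {
		return opts, errors.New("both -f and -token are required")
	}
	return opts, nil
}

func main() {
	opts, err := parseArgs(os.Args[1:])
	if err != nil {
		// A real tool would probably exit with a non-zero status here.
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("would process", opts.file)
}
```

Using a dedicated `FlagSet` instead of the global flags keeps the parsing easy to test with arbitrary argument slices.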

If I make a web client from which this process would be run, the user would sign in through OAuth with their Github account. This would eliminate the step where they have to manually generate their personal access token and paste it into the CLI.
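For context, this is roughly where a personal access token ends up: in the Authorization header of each gist-creation request. The sketch below assumes Github's REST endpoint for creating gists (`POST https://api.github.com/gists`) and only builds the request without sending it. The names `gistPayload`, `gistFile`, and `newGistRequest` are hypothetical and not taken from the repo.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// gistFile and gistPayload mirror the JSON body the gists endpoint expects.
type gistFile struct {
	Content string `json:"content"`
}

type gistPayload struct {
	Description string              `json:"description"`
	Public      bool                `json:"public"`
	Files       map[string]gistFile `json:"files"`
}

// newGistRequest builds (but does not send) a request that would create a
// single-file gist on behalf of the token's owner.
func newGistRequest(token, filename, content string) (*http.Request, error) {
	body, err := json.Marshal(gistPayload{
		Description: "Generated from a markdown snippet",
		Public:      true,
		Files:       map[string]gistFile{filename: {Content: content}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, "https://api.github.com/gists", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "token "+token)
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newGistRequest("<GITHUB_ACCESS_TOKEN>", "snippet-1.go", "package main\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Nothing is sent here; http.DefaultClient.Do(req) would perform the call.
	fmt.Println(req.Method, req.URL)
}
```

With OAuth, the only part that would change is where the token comes from; the request itself would look the same.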

That's something that would be nice to have but isn't in my crosshairs just yet -- let's check out some more pressing issues.

Problems Encountered

While solving the problems I cited earlier in this post was my main objective, there were a couple of places in writing this script where I ran into trouble.

One such problem was handling the formatting of the gist that would be created. What looks good in my markdown file may not necessarily look good in the gist.

In my first pass, I had excess whitespace due to newlines and oversized tabs. By iterating over the characters in the code snippet, I could remove unnecessary newlines and replace tabs with spaces. Here's an example:

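import "bytes"

// Unicode constants
const (
    TAB     = 9
    SPACE   = 32
    NEWLINE = 10
)

// removeInitialNewline drops a single leading newline, if there is one.
func removeInitialNewline(content string) string {
    // The length check avoids an index-out-of-range panic on an empty snippet.
    if len(content) > 0 && content[0] == NEWLINE {
        content = content[1:]
    }
    return content
}

// replaceTabsWithSpaces swaps every tab for two spaces.
func replaceTabsWithSpaces(content string) string {
    var buffer bytes.Buffer
    for _, c := range content {
        if c == TAB {
            buffer.WriteRune(SPACE)
            buffer.WriteRune(SPACE)
        } else {
            buffer.WriteRune(c)
        }
    }
    return buffer.String()
}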
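As an aside, the same cleanup can be expressed more compactly with the standard `strings` package. This is an alternative sketch, not what the script does: `cleanSnippet` is a hypothetical helper, and the `width` parameter is an assumption added so the number of spaces per tab is configurable.

```go
package main

import (
	"fmt"
	"strings"
)

// cleanSnippet drops one leading newline (if present) and then expands
// every tab into `width` spaces.
func cleanSnippet(content string, width int) string {
	content = strings.TrimPrefix(content, "\n")
	return strings.ReplaceAll(content, "\t", strings.Repeat(" ", width))
}

func main() {
	raw := "\nfunc main() {\n\tfmt.Println(\"hi\")\n}"
	fmt.Printf("%q\n", cleanSnippet(raw, 2))
}
```

The hand-rolled loop is still a nice exercise in working with runes and buffers, which is part of why it was worth writing.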

Now out of all of the Go code I wrote for this script, why did I show the trimming of the newlines and tabs first? Why didn't I show the structs I created to represent gists and snippets?

Because the whitespace issue was the key to solving one of my biggest problems stated in the beginning! It was the easiest problem that yielded the biggest results.

There were also some lower-level issues around how best to parse files and build strings using byte buffers. I'll tackle these in a future post, since they deserve their own.

Lessons Learned and Future Projects

I loved this project for several reasons:

  • It was small enough that it wasn't too intimidating
  • The end goal was clear and practical: to improve the quality of my blog posts
  • I had the opportunity to practice developing a Go command line tool
  • It can easily fit into an ecosystem of similar blog-productivity tools

By that last point, I mean that the gist generation step may be one of several.

I could write a similar script that would eliminate the excess markdown characters from the blog text. I discovered one way to partially do this step via this post.
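Such a script might start out as something like the sketch below. It is only an assumption about how that future step could work: `stripInlineMarkdown` is a hypothetical name, and the regular expressions handle just the simple single-line cases of backticks, bold, and italics.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// `code` spans on a single line
	inlineCode = regexp.MustCompile("`([^`\n]+)`")
	// **bold** spans on a single line
	bold = regexp.MustCompile(`\*\*([^*\n]+)\*\*`)
	// *italic* spans on a single line
	italic = regexp.MustCompile(`\*([^*\n]+)\*`)
)

// stripInlineMarkdown removes the wrapping characters but keeps the words.
// Bold is handled before italics so "**x**" is not half-consumed.
func stripInlineMarkdown(text string) string {
	text = inlineCode.ReplaceAllString(text, "$1")
	text = bold.ReplaceAllString(text, "$1")
	text = italic.ReplaceAllString(text, "$1")
	return text
}

func main() {
	fmt.Println(stripInlineMarkdown("call `someFunction` with **care** and *patience*"))
}
```

It should only run over prose, after the code snippets have been swapped for gists, since asterisks inside code would otherwise be mangled.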

Medium allows one to import a markdown file if it's pasted in a gist (since it requires a URL). This causes it to handle backtick and asterisk wrapped words properly. It even handles the hyperlinks within your text.

To handle grammatical mistakes I could add a step that perhaps calls a simple Grammarly-ish API. If an open source solution doesn't already exist, that might be another future project idea.

I could even create a dashboard that would allow me to easily delete gists if a mistake in my script causes faults in any of them. The current official gist dashboard is rather inefficient for editing or deleting gists.

Finally, I've been floating around the idea of a cloud-based markdown editor that would be accessible from mobile. You know, when I'm waiting in line at Disneyland and want to edit one of my articles. Times like these.

Wrapping Up

So much to do, so little time it seems. I suppose my next step is to cover those other aesthetic issues regarding importing stories from markdown.

At this point, any project that reduces the friction involved in writing technical content is valuable.

This project arose from my desire to eliminate some of the pain associated with tons of redundant editing. Some pains are caused by a lack of familiarity with platforms, others are caused by an actual limitation in the platform.

Writing articles like this one and embarking on similar projects solves both problems: you learn the capabilities of that platform all the while improving the parts where it falls short.

A great learning experience, no doubt, in the name of productivity.

Again, here is the link to the project's repo.

Curious for more posts or witty remarks? Give me some likes and follow me on Medium, Github and Twitter!
