Formatting notes

Unlike metadata and annotations, Zotero notes offer few options for custom formatting. If that matters to you, consider using Zotero 6 and its native annotation features.

Limitations

The Zotero API returns notes as a block of HTML code:

<div style=\"background-color: Yellow\">Annotations</div><p>\"After all, argues Balthasar, \"in a world without beauty... the good also loses its attractiveness, self-evidence why it must be carried out.\" Why not prefer evil over good? \"Why not investigate Satan's depth?\" (<a href=\"zotero://open-pdf/library/items/A12BCDEF?page=1\">Smith 2003:1</a>)</p>

If your note contains multiple highlights, comments, or complex formatting (headers, lists, etc.), several steps are needed to import it properly into Roam:

  • Separating the note into blocks

    • Getting this right is very dependent on how your Zotero notes are created (manually or via plugins like Zotfile).

    • The extension uses the newline character, \n, as the default separator, but you can change this in your user settings if it doesn't work with your setup.

    • If you use Zotfile to extract PDF highlights into Zotero notes, make sure to take a look at Zotfile's hidden settings, and adjust them to be compatible with the separator you choose in your zoteroRoam settings.

      • For example, if your Zotfile settings use <p></p> tags to surround comments and highlights, use </p> or the "Paragraph" preset as your notes divider.

  • Removing the extra HTML markup, like <div> tags

    • That's not easy to do in a comprehensive way, short of building a full HTML-to-Roam-Markdown converter - which is beyond the scope of the extension.

    • Feel free to create a GitHub issue for any markup that isn't supported by the extension's parser -- but be aware of the limitations presented here.

  • Converting meaningful HTML tags, like <b>, into Roam markup

    • This is the easy part!
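
The three steps above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not the extension's actual parser: the function name `splitNoteIntoBlocks`, the `</p>` default separator, and the sample note are all assumptions made for the example.

```javascript
// Hypothetical helper, for illustration only (not part of zoteroRoam).
function splitNoteIntoBlocks(noteHTML, separator = "</p>") {
  return noteHTML
    // Step 1: separate the note into blocks
    .split(separator)
    .map(block => block
      // Step 3: convert meaningful tags into Roam markup
      .replace(/<\/?(b|strong)>/g, "**")
      .replace(/<\/?(i|em)>/g, "__")
      // Step 2: drop whatever markup is left (<div ...>, <p>, ...)
      .replace(/<[^>]+>/g, "")
      .trim())
    // Discard blocks that are empty once the markup is gone
    .filter(block => block.length > 0);
}

// Made-up note with one highlight and one comment
const sampleNote = '<p>A <b>bold</b> claim</p><p><i>My comment</i></p>';
console.log(splitNoteIntoBlocks(sampleNote));
// → ["A **bold** claim", "__My comment__"]
```

Note that the meaningful tags are converted before the leftover markup is stripped; doing it the other way around would discard the `<b>` and `<i>` tags before they could be translated.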

Given these limitations, the best option for working with Zotero notes is currently to use a tool like Zotfile, and to tweak its formatting settings and the zoteroRoam separator so that they are compatible.

If you want more control over the output, and don't want to migrate to Zotero native annotations, you have the option of writing a custom JavaScript function to format your notes.

JavaScript template

If you decide to write your own parsing function, you will have to clean up all the HTML markup yourself. Given the complexity involved, this approach is supported but not recommended.

Example

Write your function, with the syntax window.myParser = function(notes){...}, and supply its name in your user settings.

// Utility function to clean up the HTML markup
function parseNoteBlock(block){
	let cleanBlock = block;
	const formattingSpecs = {
		"<blockquote>": "> ",
		"</blockquote>": "",
		"<strong>": "**",
		"</strong>": "**",
		"<em>": "__",
		"</em>": "__",
		"<b>": "**",
		"</b>": "**",
		"<br />": "\n",
		"<br/>": "\n",
		"<br>": "\n",
		"<u>": "",
		"</u>": ""
	};
	for(const prop in formattingSpecs){
		cleanBlock = cleanBlock.replaceAll(`${prop}`, `${formattingSpecs[prop]}`);
	}
	
	// HTML tags that might have attributes: p, div, span, headers
	const richTags = ["p", "div", "span", "h1", "h2", "h3", "h4", "h5", "h6"];
	richTags.forEach(tag => {
		// Covers both the bare case (<tag> or </tag>) and the case with attributes (<tag attr="...">)
		const tagRegex = new RegExp(`<\/?${tag}>|<${tag} .+?>`, "g");
		// Header tags become bold markers; the other tags are simply removed
		cleanBlock = cleanBlock.replaceAll(tagRegex, tag.startsWith("h") ? "**" : "");
	});
	
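	// Convert links to Markdown; tolerates extra attributes before or after href (e.g. rel="...")
	const linkRegex = /<a\s[^>]*?href="([^"]*)"[^>]*>(.+?)<\/a>/g;
	cleanBlock = cleanBlock.replaceAll(linkRegex, "[$2]($1)");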
	
	return cleanBlock;
}

// The function below does the following:
// 1- flattens the nested Array structure
// (i.e., it does not matter whether there is one note or several)
// 2- parses each block to detect if it contains "highlight-red"
// (if it does, an #important tag is added to the block)
// 3- runs each block through the custom utility above to clean the HTML markup,
// and returns the result to be imported into Roam

window.myParser = function(notes){
  // Step 1
  let blocks = notes.flat(1);
  // Step 2
  blocks = blocks.map(b => {
    return b.includes("highlight-red") ? (b + " #important") : b;
  });
  // Step 3
  return blocks.map(b => parseNoteBlock(b)).filter(b => b.trim());
}
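
To try a parser before wiring it into your settings, you can call it by hand with a nested array shaped like the input described in the comments above. The sketch below is self-contained, so it uses a deliberately tiny stand-in parser (`demoParser`, a hypothetical name) rather than `window.myParser`. The sample blocks, the way `highlight-red` appears in them, and the exact input shape are assumptions made for illustration.

```javascript
// Hypothetical stand-in for a custom parser. It follows the same contract
// as the example above (nested arrays of HTML strings in, flat array of
// cleaned strings out), but its cleanup step is much cruder.
const demoParser = function(notes){
  return notes
    // Flatten one level, so one note or several are handled alike
    .flat(1)
    // Tag blocks that carry the "highlight-red" marker
    .map(b => b.includes("highlight-red") ? (b + " #important") : b)
    // Crude cleanup: strip every HTML tag, then trim whitespace
    .map(b => b.replace(/<[^>]+>/g, "").trim())
    // Drop blocks that ended up empty
    .filter(b => b);
};

// Two made-up notes, each already split into blocks (assumed shape)
const sampleNotes = [
  ['<p><span class="highlight-red">Key passage</span></p>', '<p>Minor passage</p>'],
  ['<p></p>', '<p>From a second note</p>']
];

console.log(demoParser(sampleNotes));
// → ["Key passage #important", "Minor passage", "From a second note"]
```

The `#important` tag is appended before the markup is stripped, which is why it survives the cleanup: it is plain text, not a tag.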
