As researchers increasingly store and share their data in digital forms, computer-readable ways to describe compounds become ever more important. Several computer codes are already available for describing discrete molecules, but they struggle with larger assemblies like polymers.

Researchers say this limitation is slowing down polymer informatics and chemistry in general. So a group of chemists led by Bradley D. Olsen and Tzyy-Shyang Lin at the Massachusetts Institute of Technology developed a solution (ACS Cent. Sci. 2019, DOI: 10.1021/acscentsci.9b00476). The group’s new language, which is based on a 30-year-old code known as the simplified molecular-input line-entry system (SMILES), is called, rather appropriately, BigSMILES. BigSMILES describes a polymer as a comma-separated list of monomers within curly brackets, with additional symbols to describe the bonds between monomers. For example, a linear polymer segment made from ethylene and 1-butene monomers would be written as {$CC$,$CC(CC)$} or {$CC$,CCC($)C$}. The team hopes the system will have multiple uses and improve communication between researchers.
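
To make the notation concrete, here is a minimal Python sketch that splits the two example strings above into their monomer entries and counts the `$` bond symbols in each. It is not the researchers' software and does not implement the full BigSMILES grammar. The function names `split_repeat_units` and `count_bond_markers` are hypothetical, and the code assumes a single curly-bracket object with no nesting, exactly as in the article's examples.

```python
from typing import List


def split_repeat_units(bigsmiles: str) -> List[str]:
    """Return the comma-separated entries inside one {...} object.

    Assumption: the input is a single curly-bracket object with no
    nesting, as in the article's examples. Anything else is rejected.
    """
    text = bigsmiles.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("expected one object wrapped in curly brackets")
    body = text[1:-1]
    if "{" in body or "}" in body:
        raise ValueError("nested or multiple objects are not handled here")
    units = [unit.strip() for unit in body.split(",")]
    if any(unit == "" for unit in units):
        raise ValueError("empty entry between commas")
    return units


def count_bond_markers(unit: str) -> int:
    """Count the '$' symbols that mark where a unit bonds to its neighbors."""
    return unit.count("$")


if __name__ == "__main__":
    for example in ("{$CC$,$CC(CC)$}", "{$CC$,CCC($)C$}"):
        units = split_repeat_units(example)
        markers = [count_bond_markers(unit) for unit in units]
        print(example, "->", units, markers)
```

In both strings, each entry carries two `$` symbols, which fits a linear chain in which every unit bonds to two neighbors. The two spellings differ only in where the second symbol is written for the 1-butene unit.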

