In our latest paper, just out in @PeerJCompSci, we show how to visually recognize the programming language of code in images, among many languages (150 in our experiments) and with high accuracy [1/n]

For the gory details: we used convolutional neural networks (CNNs) pretrained on generic image recognition tasks and then adapted to the problem domain of visual programming language identification. [2/n]

We trained on 300k real-world code snippets from popular repositories extracted from @swheritage, achieving 92% precision and recall. Even more gory details can be found in the package. Feedback welcome, enjoy! [3/3]

Do you know of similar software for natural languages?

We are building a database with translations.

We would also like adding new data to the database to be as easy as possible. Ideally, people would just give the DOI/URL of the original and of the translation, and the system would determine or estimate the languages. Alternatively, the system would pre-fill a web form with a reasonably accurate guess, and people would normally only have to confirm it.
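
To make the "pre-fill and confirm" idea concrete, here is a rough sketch of what I have in mind. It is not based on any existing tool: the function names `guess_language` and `prefill_form`, the tiny stopword lists, and the `min_share` threshold are all made up for illustration, and a real system would need a proper language identification model.

```python
import re
from collections import Counter

# Tiny hand-picked stopword lists; purely illustrative, not a real language model.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is", "that", "for", "with", "we"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "mit", "für", "wir", "ein"},
    "fr": {"le", "la", "et", "les", "des", "est", "pour", "dans", "nous", "une"},
}


def guess_language(text, min_share=0.15):
    """Return (language_code, share); language_code is None when too uncertain.

    `share` is the fraction of words that are stopwords of the best language.
    """
    # Letter-only tokens, so digits and punctuation are ignored.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return None, 0.0
    counts = Counter(words)
    scores = {
        lang: sum(counts[w] for w in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    share = scores[best] / len(words)
    return (best if share >= min_share else None), share


def prefill_form(original_text, translation_text):
    """Build the (hypothetical) form fields; a human still confirms them."""
    original_lang, _ = guess_language(original_text)
    translation_lang, _ = guess_language(translation_text)
    return {
        "original_language": original_lang,        # None means: leave blank
        "translation_language": translation_lang,  # None means: leave blank
        "needs_confirmation": True,                # never save without a human check
    }
```

The point of the threshold is that an uncertain guess leaves the field empty instead of pre-filling something wrong that people might confirm without looking.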


I noticed the inactive @PeerJCompSci mention there. You might want to suggest to PeerJ that they create a (non-robotic) Fediverse account, to match PeerJ's claim to be a journal in the spirit of #OpenScience, rather than only promoting the birdsite. I already sent a message along those lines - two independent messages from PeerJ/PeerJCompSci authors may be more convincing than just one...

