Show HN: Finding similarities in New Yorker covers

(shoplurker.com)

18 points | by tkp-415 1 day ago

6 comments

  • araes 8 minutes ago
    Cool project, and at least with the New Yorker images there's such an enormous amount of diversity that there's actually quite a bit to compare against.

    Also, just really cool to see how much diversity there has been in New Yorker front-page images. Not many archives provide such a clear, easily viewable format just for browsing through all the art and seeing how the New Yorker has changed its art style over time. Quite a few covers would probably qualify as "art" in and of themselves.

    Also, some neat discoveries while browsing through the archive. "Crazy, that looks like a Far Side ... wait, that's because Gary Larson drew the cover." Had no idea Gary Larson did any magazine covers. [1]

    [1] 2003-11-17, Gary Larson, https://shoplurker.com/labs/newyorker-covers/covers/2003/590...

    Did something kind of similar a while back with Magic: The Gathering cards, except it used the perceptual hash from https://www.phash.org/. The visual clustering display (such as this example from 5th Edition [2]) is, I think, what some people here are suggesting would be valuable: it gives an idea of "why" the images are related.

    [2] https://araesmojo-eng.github.io/araesmojo-html/images/Projec...

    Been a long time (2018), but pretty sure it was two-dimensional "image feature extraction", kind of like this other example [3], with card colors collapsed to a single average color to give an idea of the color pattern distribution (rough sketch after the link below).

    [3] https://digitalsloan.com/project-blog/cbri
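
    A minimal sketch of that kind of pipeline, assuming Pillow and the ImageHash package; the "cards/" directory and the distance threshold are placeholders, not the original 2018 code:

      # Perceptual hash plus a collapsed average-color feature per card image.
      from pathlib import Path

      import imagehash
      from PIL import Image

      def card_features(path):
          img = Image.open(path).convert("RGB")
          phash = imagehash.phash(img)                       # 64-bit perceptual hash of the card art
          avg_color = img.resize((1, 1)).getpixel((0, 0))    # whole card collapsed to one average RGB color
          return phash, avg_color

      # "cards/" is a placeholder directory of scanned card images.
      features = {p.name: card_features(p) for p in Path("cards").glob("*.jpg")}

      # Hamming distance between two pHashes: small values suggest near-duplicate art.
      names = sorted(features)
      for a in names:
          for b in names:
              if a < b and (features[a][0] - features[b][0]) <= 8:
                  print(a, b, features[a][0] - features[b][0])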

  • smusamashah 1 day ago
    I don't understand the UI at all. When I click All or something within the brackets, what am I supposed to see? Covers similar to what I clicked? But the covers I see don't seem similar to me at all, no matter what I click. What am I missing? Or maybe I am expecting a different kind of similarity.
    • tkp-415 1 day ago
      The confusion is understandable, as the comparison is basic and uses image hashes (https://pypi.org/project/ImageHash/), which are pretty surface-level and don't always provide reliable "this image is obviously very similar to that one" results.

      You are correct that when you click something in the brackets, the results returned are covers similar to what you clicked.

      There's still a lot of room for improvement as I go further down this image-matching rabbit hole, but the comparison in its current state does provide some useful results every so often. A rough sketch of that kind of hash lookup is below.
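
      A hedged sketch of the lookup described above (not the site's actual code): hash every cover once, then rank the other covers by Hamming distance to the one that was clicked. The "covers/" directory and the function name are placeholders.

        from pathlib import Path

        import imagehash
        from PIL import Image

        # Hash every cover once up front; average_hash is the simplest ImageHash function.
        hashes = {p.name: imagehash.average_hash(Image.open(p)) for p in Path("covers").glob("*.jpg")}

        def similar_to(clicked, k=10):
            """Return the k covers whose hash is closest to the clicked cover's hash."""
            target = hashes[clicked]
            others = ((name, target - h) for name, h in hashes.items() if name != clicked)
            return sorted(others, key=lambda pair: pair[1])[:k]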

  • Samin100 23 hours ago
    This is awesome! Using a CLIP or DINOv2 model to produce image embeddings would probably improve the similarity search a lot - kind of similar to http://same.energy/.
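
    A hedged sketch of that embedding approach, using the sentence-transformers wrapper around CLIP; the model name, the "covers/" directory, and the query index are examples, not anything from the site:

      from pathlib import Path

      from PIL import Image
      from sentence_transformers import SentenceTransformer, util

      model = SentenceTransformer("clip-ViT-B-32")        # CLIP image/text encoder

      paths = sorted(Path("covers").glob("*.jpg"))        # placeholder cover directory
      embeddings = model.encode([Image.open(p) for p in paths], convert_to_tensor=True)

      # Cosine similarity in embedding space tends to capture subject and style,
      # not just pixel layout the way a perceptual hash does.
      query = 0                                           # index of the clicked cover
      scores = util.cos_sim(embeddings[query], embeddings)[0]
      best = scores.argsort(descending=True)[1:6]         # top 5, skipping the cover itself
      for i in best:
          print(paths[int(i)].name, float(scores[i]))
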
  • multisport 23 hours ago
    I like it! Would be nice to have some visual indication of "what" was similar, or why, or even how much. Or an example of "this is similar to that because ...". Maybe a visualizer of the algorithms?
  • llstr 1 day ago
    Hey! Cool! Does your code use any of the publicly available libraries (pHash, hmsearch, …), or did you start coding from scratch based on research papers? Is there a git repo one can fork?

    Anyway, KUTGW