• Aux@lemmy.world
    1 month ago

    You can start by running sudo apt install tesseract-ocr and then reading its docs.

    • MacN'Cheezus@lemmy.today
      1 month ago

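      It appears to be as simple as tesseract <infile> <outbase>; tesseract appends .txt to the output name itself. You could possibly even pipe (or tee) the screenshot straight into it and save both an image and a text file in a single command line.

      So something like this should do the trick, assuming gnome-screenshot will actually write the image to stdout with -f - (the timestamp is captured once so both files get the same name):

      ts=$(date +%s); gnome-screenshot -f - | tee /Microsoft/yourPrivacy/$ts.png | tesseract - /Microsoft/yourPrivacy/$ts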
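
If piping the screenshot through stdout turns out not to work, a more defensive variant of the one-liner above might look like the sketch below. It writes the image to a real file first and then runs OCR on it. The function names capture_base and capture_and_ocr are made up for illustration; the only tool options relied on are gnome-screenshot -f FILE and tesseract IMAGE OUTPUTBASE.

```shell
#!/bin/sh
# Sketch only: save a screenshot and its OCR text under one shared timestamp.
# capture_base and capture_and_ocr are hypothetical helper names.

# Build the shared basename once so the .png and the .txt always match.
capture_base() {
    printf '%s/%s' "$1" "$(date +%s)"
}

# capture_and_ocr DIR: take a screenshot into DIR, OCR it, print the basename.
capture_and_ocr() {
    dir=$1
    mkdir -p "$dir" || return 1
    base=$(capture_base "$dir")
    # Write the image to a real file, so nothing depends on the
    # screenshot tool being able to write to stdout.
    gnome-screenshot -f "$base.png" || return 1
    # tesseract appends ".txt" to the output base itself.
    tesseract "$base.png" "$base" || return 1
    printf '%s\n' "$base"
}
```

Calling capture_and_ocr "$HOME/screenshots" should leave a <timestamp>.png and a <timestamp>.txt side by side in that directory, at the cost of no longer being a single pipeline.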
      
      • Aux@lemmy.world
        29 days ago

        It is much better to search using Elasticsearch or Sphinx. Grep is slow on a large collection because it has no index and rescans every file on each query, and it can’t do natural-language full-text searches. It’s pretty much useless for any real-world text search you’d want from OCRed content. And all these better tools are free and open source, so it’s really a no-brainer.
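
To make the indexed versus non-indexed point concrete, here is a toy inverted index in plain shell. It is only a sketch of the core idea (scan the text once at index time, then answer queries from the index) and is not how Elasticsearch or Sphinx are implemented. The names build_index and lookup are hypothetical, and the lookup still reads the index file linearly where a real engine would use a hash or tree.

```shell
#!/bin/sh
# Toy inverted index over OCR text files; function names are hypothetical.
# Real search engines add ranking, stemming, and phrase queries on top of this.

# build_index DIR INDEX_FILE: write sorted, unique "word<TAB>file" pairs
# for every .txt file directly inside DIR.
build_index() {
    for f in "$1"/*.txt; do
        [ -f "$f" ] || continue
        # Lowercase the text, then split it into one word per line.
        tr '[:upper:]' '[:lower:]' < "$f" | tr -cs '[:alnum:]' '\n' |
            awk -v file="$f" 'NF { print $0 "\t" file }'
    done | sort -u > "$2"
}

# lookup INDEX_FILE WORD: print the files that contain WORD (case-insensitive).
lookup() {
    word=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    awk -F '\t' -v w="$word" '$1 == w { print $2 }' "$1"
}
```

After a one-time build_index "$HOME/screenshots" "$HOME/ocr.index", a query such as lookup "$HOME/ocr.index" invoice only touches the index file, whereas grep -l would reopen every text file on every search.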