• Overzeetop@beehaw.org · 8 months ago

    a toy for professional workloads

    [rant]

    I think this is one of those words that has lost its meaning in the personal computer world. What are people doing with computers these days? Every single technology reviewer is, well, a reviewer - a journalist. The heaviest workload their computer will ever see is Photoshop, and 98% of the time will be spent in word processing at 200 words per minute or in a web browser. A mid-level phone from 2016 can do pretty much all of that work without skipping a beat. That’s “professional” work these days.

    The heavy loads Macs are benchmarked to lift are usually video processing. Which, don’t get me wrong, is compute intensive - but modern CPU designers have recognized that they can’t lift that load in the general-purpose cores, so all modern chips have secondary pipelines that are essentially embedded ASICs optimized for very specific tasks. Video codecs are now, effectively, hardcoded onto the chips. Phone chips running at <3W TDP are encoding 8K60 in real time, and the cheapest i-series Intel x64 chips are transcoding a dozen 4K60 streams while the main CPU is idle 80% of the time.
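
    To make that concrete, this is roughly what handing the job to the fixed-function block looks like from ffmpeg (a sketch only - which hardware encoders are available depends on your chip and how ffmpeg was built):

    ```
    # software encode: the general-purpose cores do all the work
    ffmpeg -i input.mov -c:v libx264 -crf 20 -preset slow out_software.mp4

    # hardware encode: the dedicated media block does the work while the CPU mostly idles
    ffmpeg -i input.mov -c:v h264_videotoolbox -b:v 8M out_hw.mp4   # Apple silicon / macOS
    ffmpeg -i input.mov -c:v h264_qsv -b:v 8M out_hw.mp4            # Intel Quick Sync
    ```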

    Yes, I get bent out of shape a bit over the “professional” workload claims because I work in an engineering field. I run finite element models and, while sparse matrix solvers have gotten faster over the years, it’s still a CPU-intensive process, and general (non-video) matrix operations aren’t really gaining all that much speed. Worse, I work in an industry with large, complex 2D files (PDFs with hundreds of 100MP images and overlaid vector graphics), and rendering speed hasn’t appreciably changed in several years because there’s no pipeline optimization for it. People out there doing CFD and technical 3D modeling, as well as other compute-intensive tasks on what we used to call “workstations”, are the professional applications that need real computational speed - and they’re/we’re just getting incremental clock-speed improvements plus scaling that goes roughly with the square root of the number of cores, when the software can even parallelize at all. All these manufacturers can miss me with the “professional” workloads of people surfing the web and doing word processing.

    [/rant]

  • Paranoid Factoid@beehaw.org · 8 months ago

      So, one point I’ll make on the hardware assist you discuss is that it’s actually limited to very specific use cases. And the best way to understand this is to read the ffmpeg x264 encoding guide here:

      https://trac.ffmpeg.org/wiki/Encode/H.264

      The x265 guide is similar, so I won’t repeat it. But there is a dizzying range of considerations to weigh when cutting a deliverable file. Concerns such as:

      • target display. Is it an old-style rec709 display with 8 bits per color and SDR at about six and a half stops of dynamic range? Is it rec2020, 10 bits per color, about eight stops? Is it a movie projector in a theater, with 12 bits per color and even more dynamic range? When producing deliverables, you choose encode settings specific to the target display type.

      • quality settings. Typically handled with the Constant Rate Factor (CRF) setting. If you’ve burned video files, you’ll know the lower the CRF number, the higher the image quality. But the higher the image quality, the lower the overall compression. It’s a tradeoff.

      • compression. The more computation put into compression, the smaller the video file at any given CRF setting. But the longer the encode takes to complete. (A sketch tying these three knobs together follows this list.)
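
      Putting those three together, a deliverable encode might look roughly like this (a minimal sketch for a hypothetical SDR rec709 target; the exact numbers are illustrative, not a recommendation):

      ```
      # CRF sets the quality, preset sets how much compute x264 spends to hit it;
      # the pixel format and colour tags match a hypothetical 8-bit SDR rec709 display
      ffmpeg -i master.mov \
        -c:v libx264 -crf 18 -preset slow \
        -pix_fmt yuv420p \
        -color_primaries bt709 -color_trc bt709 -colorspace bt709 \
        -c:a aac -b:a 192k \
        deliverable_709.mp4
      ```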

      This is only for local playback. Streaming requires additional tweaks. And it’s only for a deliverable file. In the production pipeline you’d be using totally different files, which store each frame separately rather than compressing groups of frames, retain far more image data per frame, and are much less compressed or entirely uncompressed overall.
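
      For contrast, a production-side intermediate might be cut to something like ProRes, where every frame is stored on its own (again a sketch - the actual codec and profile depend on the house pipeline):

      ```
      # intra-frame only, 10-bit 4:2:2, lightly compressed: huge files, but cheap to scrub and edit
      ffmpeg -i camera_original.mov -c:v prores_ks -profile:v 3 -pix_fmt yuv422p10le -c:a copy intermediate.mov
      ```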

      The point of this is to highlight the vast difference in demands placed on encoding throughout the various stages of a project. And to point out that for video production you care about system I/O bandwidth most of all.

      But hardware encode limits you to very specific output ranges. This is what the preset limitations are all about for, say, Nvidia NVENC hardware-assisted H.264 in ffmpeg. The hardware devs pick what they think is the most common use case, say YouTube as an output target (which makes network-bandwidth and display-type presumptions), and target their hardware accel at that.
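
      You can see it in the knobs ffmpeg exposes (rough sketch; the p1-p7 preset names assume a reasonably recent ffmpeg/driver, and the numbers are illustrative):

      ```
      # hardware path: a handful of presets and rate-control modes
      ffmpeg -i input.mov -c:v h264_nvenc -preset p5 -rc vbr -cq 23 -b:v 0 out_nvenc.mp4

      # software path: CRF plus the full spread of presets, tunes, profiles, psy options, etc.
      ffmpeg -i input.mov -c:v libx264 -crf 23 -preset slow -tune film out_x264.mp4
      ```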

      This means most of that marketing talk about hardware assist in M-series chips, GPUs, etc. isn’t actually relevant for production work. It’s only relevant for cutting final deliverable files for specific use cases like YouTube, or broadcast (which still wants 10-bit ProRes).

      If you look at just the x264 settings, the hardware-accel presets are so limited that most of the time you’d still be cutting with a software encode. Hardware encode comes into play for real-time work, like streaming and live broadcast. The rest of the pipeline? All software.
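
      That real-time case is where the hardware block earns its keep - e.g. pushing a live stream at a fixed bitrate while the CPU stays free for everything else (sketch only; the RTMP endpoint and bitrates are placeholders):

      ```
      ffmpeg -re -i program.mp4 \
        -c:v h264_nvenc -preset p4 -b:v 6M -maxrate 6M -bufsize 12M -g 120 \
        -c:a aac -b:a 160k \
        -f flv rtmp://live.example.com/app/streamkey
      ```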

      • Overzeetop@beehaw.org · 8 months ago

        Indeed! It makes the benchmarks that much more disingenuous since pros will end up CPU crunching. I find video production tedious (it’s a skill issue/PEBKAC, really) so I usually just let the GPU (nvenc) do it to save time. ;-)