Future video codecs

>different profiles trained using AI machine learning bullshit (e.g. anime profile, live action, etc.)
>compress everything down to a single bit
are you ready for the future user?

av2 doesn't have either of those features. although vvc does kind of have the first one: something like 4000 hours of test footage across various content types was run against existing dumb metrics to tune the presets, as opposed to avc and hevc, which did their profiles entirely by hand and based on theory

sounds gay. any info on av2?
I would assume any "modern" video codec in development should:
1) use machine learning to some extent, with different profiles specifically trained for specific content types
2) take VR/3D/360 video into consideration

ottverse.com/av2-video-codec-evaluation
codecs.multimedia.cx/2021/08/about-upcoming-av2
aomedia.googlesource.com/aom/ /refs/heads/research2/doc/

meme codec, only useful for anime

thank you user
??? what, av1?

>only useful for garbage modern anime
Ftfy.
Clicking the bucket tool is high art.

Sauce on VVC having training footage?

that will absolutely not fly
You have three major issues with using AI in your codec
1. Companies are unwilling to share data. Companies like IBM are working on federated learning which should alleviate these issues, but they'll always be present
2. Who said they're going to share their trained model? No, you'll get AV2-Youtube, AV2-Netflix, AV2-Prime
3. Models are updated over time. Yesterday's movie could look completely different on tomorrow's model.

I imagine that if they use a NN in the standard, all weights will be explicitly specified in the standard and the model will never change. You need to know the weights to decode the video - they definitely won't share the training dataset, but they'll have to share the model architecture and weights in any case.
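
A toy sketch of what "weights fixed in the standard" could look like - entirely hypothetical, but using the integer arithmetic real codecs already use for their (hand-designed) loop filters:

```python
# Hypothetical sketch: a "neural" post-filter whose weights are normative
# constants printed in the standard, the way codecs already fix their
# loop-filter coefficients. Every decoder ships the same numbers, so no
# one needs the training data and the model can never drift.

SPEC_WEIGHTS = [1, 2, 1]  # toy 3-tap kernel, fixed forever by the spec
SPEC_SHIFT = 2            # normalize by 4 with integer math, codec-style

def spec_filter(samples):
    """Apply the normative filter; edge samples pass through unchanged."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        acc = sum(w * samples[i + k - 1] for k, w in enumerate(SPEC_WEIGHTS))
        out[i] = (acc + (1 << (SPEC_SHIFT - 1))) >> SPEC_SHIFT  # round to nearest
    return out

# Any two conforming decoders produce bit-identical output, because the
# weights are part of the spec rather than part of the bitstream.
filtered = spec_filter([10, 80, 12, 79, 11])
```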

Also a NN solved this captcha for me. T0XDVS. Thanks NN.

>billion page standard
Imagine having to print out DALLE-2

Well, there's no way to add a neural network otherwise. I guess put the weights in the file with the movie? But then it's going to be impossible to play it starting from the middle, you'd have to download the beginning to get the weights.

Anyway, not every NN has to have gorillions of parameters.

The entire benefit of using a machine learning model is that the laws of information theory don't apply, because you're retrieving information from an external source.
Embedding the neural network in the video would negate this benefit and your video wouldn't be any smaller.

Depending on the size of the video and the size of the NN you can get a net benefit. For any size of neural network, if it does improve video encoding, there is a video length past which adding the NN to the file and compressing the video with it saves space compared to not using the NN. Not sure what you mean by laws of information theory.
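
Back-of-envelope version, with every number invented for illustration - a fixed cost (the attached model) against a per-second saving:

```python
# Toy break-even calculation; all constants are made up, nothing here
# comes from a real codec or a real network.

MODEL_BYTES = 5_000_000   # assumed size of the attached network
BASELINE_RATE = 250_000   # assumed bytes/second without the model
SAVINGS = 0.10            # assumed 10% bitrate reduction from the NN

def total_size(seconds, with_model):
    if with_model:
        return MODEL_BYTES + seconds * BASELINE_RATE * (1 - SAVINGS)
    return seconds * BASELINE_RATE

# Sizes are equal when MODEL_BYTES == seconds * BASELINE_RATE * SAVINGS.
break_even = MODEL_BYTES / (BASELINE_RATE * SAVINGS)  # 200 seconds here
```

With these made-up numbers a 1-minute clip loses and a feature film wins; everything hinges on the assumed savings actually materializing.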

Traditional compression algorithms are based on information theory.
You work with what you're given. You can't create information out of nothing. Given these two basic statements, create an algorithm that shortens an input, and its corresponding decoder.

Imagine if you said the asymptotic runtime of all functions was O(1) because "lol I just pulled it from a huge precomputed database."
Every academic would be on to your bullshit. That's basically what you're doing with machine learning.
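
The "can't create information out of nothing" part is just the counting argument: there are 2**n bit strings of length n but only 2**n - 1 strictly shorter ones, so no lossless codec shrinks every input. Easy to see with zlib (levels and behavior as in the Python standard library):

```python
# Structured data compresses because it has redundancy to remove;
# random data does not, and the container overhead makes it grow.
import os
import zlib

structured = b"the same phrase over and over " * 1000
random_ish = os.urandom(len(structured))

small = zlib.compress(structured, 9)
big = zlib.compress(random_ish, 9)
```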

>a total benefit
I claim this benefit is difficult enough to compute that 90% of people would not have the time to do it, and that having a standardized single model would be considered more practical

true, it can't retain grain or low contrast textures

I was in favor of a standardized model from the beginning. What I'm saying is, you are incorrect about negating the benefit, and invoking information theory while being clearly incorrect just sounds silly. As another example, an LZW binary compiles to a few tens of kilobytes, and for many real-life plaintexts you could send an email with the compressed text and the LZW binary included and it would still be a huge win over sending plaintext.
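
The LZW point is easy to check with a toy implementation (the textbook variant, not the exact Unix compress or GIF format):

```python
# Minimal LZW: a small, fixed decoder plus compressed codes can easily
# beat raw plaintext on repetitive real-world text.

def lzw_compress(data: bytes) -> list[int]:
    table = {bytes([i]): i for i in range(256)}
    out, cur = [], b""
    for byte in data:
        nxt = cur + bytes([byte])
        if nxt in table:
            cur = nxt
        else:
            out.append(table[cur])
            table[nxt] = len(table)  # learn the new phrase
            cur = bytes([byte])
    if cur:
        out.append(table[cur])
    return out

def lzw_decompress(codes: list[int]) -> bytes:
    table = {i: bytes([i]) for i in range(256)}
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        # Special case: the code may refer to the entry being built.
        entry = table[code] if code in table else prev + prev[:1]
        out.append(entry)
        table[len(table)] = prev + entry[:1]
        prev = entry
    return b"".join(out)

text = b"the quick brown fox " * 200
codes = lzw_compress(text)
```

Round-tripping recovers the text, and even charging a generous 2 bytes per code the output is a small fraction of the 4000-byte input.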

Then it's up to you to prove you can beat LZW with a machine learning model.
Or, more on topic, an AV1 decoder, for some specified CRF level.
I think you will have a hard time.

What the fuck are you talking about.

If you're claiming that you can deliver a smaller video with the neural network attached than AV1 (hardware decoder, 0 disk space) you'd better be ready to prove it
Because remember, your theory in a vacuum means nothing if it's useless and adds complication for no reason.

The assumption above was
>if it does bring improvement to video encoding
I'm assuming the NN helps; I don't have to prove it. With this assumption it should be obvious that there are cases where including the NN helps. I omitted the reasoning, but if you're retarded and can't see it, reply and I'll write it out.

You're assuming your conclusion by fiat.
Obviously compression beats no compression, but no one wins prizes for second place.
The thread is about future video codecs, not researcher grant money pits

Then just fucking say that in the first place - that you don't think using a NN will help. None of that bullshit about information theory is correct. I regret writing a proper response to your retarded ass.

You didn't assume that from the tone?
Perhaps you should look in a mirror.
And you merely intentionally misunderstood the information theory part.
All I said is that traditional algorithms don't add information from an external source, while machine-learning algorithms get their advantage precisely from one.
Embedding that external source alongside your data diminishes that benefit.
The only reason you would ever even do that is simply because of industry coincidences not based in math.