Optimizing Away C++ Virtual Functions May Be Pointless - Shachar Shemesh - CppCon 2023

CppCon

3 months ago

14,536 views

Comments:

@reneb86 - 12.05.2024 10:35

I think every developer has instinctively felt that the "virtual functions are slow" claim is overly generalized. Virtual functions have a practical purpose, and that practical purpose can outweigh a need for performance. And given a proper context, the correct implementation of virtual functions will also yield performance gains. What is surprising here is that compiler optimization and CPU management seem to be a decade or two ahead of what developers are thinking.

@nicholastheninth - 07.04.2024 23:46

Well, there was one time when I decided not to inline a function inside a loop that gets called a few thousand times per frame, and the performance dropped to 60% of what it was when force-inlining it, even though it really shouldn't have mattered (the function does a not-insignificant amount of work).

@matiasgrioni292 - 27.03.2024 01:35

The talk is good, but the presenter's handling of questions certainly does not make me think that he thinks his talk is not a period.

@TechTalksWeekly - 15.03.2024 17:25

This talk has been featured in the #6 issue of Tech Talks Weekly newsletter.
Congrats Shachar 👏👏👏

@Jwareness - 13.03.2024 14:32

Virtual functions are orders of magnitude slower in real-world use cases due to the penalties incurred by calling different derived functions in the same code path as you can't take full advantage of modern hardware branch prediction, memory prefetching and data/instruction caches. For a serious comparison, see '"Clean" Code, Horrible Performance' by Molly Rocket.

@RPG_Guy-fx8ns - 12.03.2024 11:48

If you are making systems that handle large amounts of data quickly, they ideally should not be object oriented or use virtual functions or inheritance. They should be data oriented and parallelized, like a GPU particle system. Functions and data should be separated, and functions should act on packed arrays of data organized by traits, not objects. Avoiding cache misses is not pointless, and removing inheritance and virtual functions is only part of the solution. This can speed up your code by very significant amounts, especially in video games or large simulations.
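
A minimal sketch of what "functions acting on packed arrays of data" can look like in C++. The `Particles` type, its fields, and `integrate` are hypothetical names chosen for illustration; they do not come from the talk or the comment, and no speedup is claimed for this particular snippet.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle storage laid out as a struct of arrays:
// each field lives in its own packed, contiguous vector.
struct Particles {
    std::vector<float> x, y;    // positions
    std::vector<float> vx, vy;  // velocities

    void add(float px, float py, float pvx, float pvy) {
        x.push_back(px);
        y.push_back(py);
        vx.push_back(pvx);
        vy.push_back(pvy);
    }

    std::size_t size() const { return x.size(); }
};

// A free function that walks the packed arrays linearly.
// There is no per-particle object and no virtual call in the loop.
void integrate(Particles& p, float dt) {
    const std::size_t n = p.size();
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}
```

Calling `integrate(p, dt)` once per frame updates every particle in one pass over contiguous memory. Whether this beats an object-per-particle design in a given program still has to be measured.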

@gregthemadmonk - 11.03.2024 12:54

Interesting to see how running the benchmark from the StackOverflow question on a modern compiler (thankfully the OP provides QuickBench links) results in std::visit actually being faster on Clang 17 + libstdc++ (and slower on libc++ :) ) than a virtual function call. Guess it just further proves the point of the talk :)
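
For readers who have not seen the two dispatch styles side by side, here is a small sketch. The shape types and the `area_virtual` / `area_variant` functions are hypothetical and are not the code from the StackOverflow question. The sketch only shows the two mechanisms and makes no claim about which one is faster.

```cpp
#include <variant>

// Virtual-dispatch version: an open hierarchy behind a base class.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};
struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
};

double area_virtual(const Shape& s) { return s.area(); }

// Variant version: a closed set of alternatives, no base class.
struct SquareV { double side; };
struct RectV { double w, h; };
using ShapeV = std::variant<SquareV, RectV>;

// One overload per alternative; std::visit picks the matching one.
struct AreaVisitor {
    double operator()(const SquareV& s) const { return s.side * s.side; }
    double operator()(const RectV& r) const { return r.w * r.h; }
};

double area_variant(const ShapeV& s) { return std::visit(AreaVisitor{}, s); }
```

How `std::visit` is lowered (jump table, switch, or something else) is up to the standard library implementation, which is one plausible reason the result differs between libstdc++ and libc++.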

@iddn - 10.03.2024 01:09

The number of people who share micro-benchmarks that they've run on laptops is crazy. Laptops are useless for this.

@sirhenrystalwart8303 - 10.03.2024 00:30

Forget about performance. Virtual functions make your code base hard to understand. Only use when absolutely necessary.

@vladimir0rus - 09.03.2024 22:04

The important takeaway from this whole talk, IMHO, is that "If it is fast enough - use it!".
For some people Python is fast enough.
There is no way a virtual call can be faster than a direct call, so the rest of the talk was about the author's inability to measure the difference properly.

@cedricmi - 09.03.2024 20:28

I agree. But in some cases, you do care about performance differences, even if the performance is "good enough". For example, when you're at a scale where an app's electricity cost is a consideration, if not the main consideration.

@mariogonzalezmunzon7076 - 09.03.2024 16:52

Great talk, deep explanation of hard concepts. I will need to rewatch it many times to really understand it.

@JSzitas - 09.03.2024 15:12

I think unless you show median/minimum timings for a piece of code, benchmarks are indeed meaningless (almost by design).
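
A rough sketch of reporting minimum and median timings rather than a single number. `TimingSummary`, `summarize`, and `measure` are hypothetical helpers written for this illustration. A serious harness would also handle warmup, clock resolution, and keeping the compiler from optimizing the measured work away.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

struct TimingSummary {
    double min_ns;
    double median_ns;
};

// Summarize per-run timings by minimum and median.
// Takes the samples by value because it sorts them.
TimingSummary summarize(std::vector<double> samples_ns) {
    if (samples_ns.empty()) return {0.0, 0.0};
    std::sort(samples_ns.begin(), samples_ns.end());
    const std::size_t n = samples_ns.size();
    const double median =
        (n % 2 == 1) ? samples_ns[n / 2]
                     : (samples_ns[n / 2 - 1] + samples_ns[n / 2]) / 2.0;
    return {samples_ns.front(), median};
}

// Time `fn` once per run, `runs` times, and summarize the samples.
template <typename Fn>
TimingSummary measure(Fn&& fn, std::size_t runs) {
    using clock = std::chrono::steady_clock;
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = clock::now();
        fn();
        const auto stop = clock::now();
        samples.push_back(
            std::chrono::duration<double, std::nano>(stop - start).count());
    }
    return summarize(std::move(samples));
}
```

The minimum approximates the cost with the least outside interference, while the median is robust to the occasional run disturbed by the scheduler or by interrupts.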

@user-ck9eh6ld5y - 08.03.2024 20:21

Inheritance, virtual functions, that's for boomers, totally uncool ;-)

@Fudmottin - 08.03.2024 17:55

I always thought virtual functions were a nice way to create maintainable code at the cost of a small bit of performance due to the VTABLE lookup for an indirect call. I was willing to accept that cost for functions that did enough work that the call was a trivial portion of it. Now I'm wondering, why worry?

@tomkirbygreen - 08.03.2024 08:21

Love the take away that design is supreme, but also the energy and enthusiasm with which the topic is explored. Awesome stuff sir. Kudos.

@johnmcleodvii - 08.03.2024 08:16

Premature optimization is a huge waste of time. Only try to optimize if your program is too slow. Then you need to figure out where in your code it is too slow, and why that section is too slow.

Two distinct cases I worked on come to mind.

Case 1. Every access to a file involved opening and closing the file. The solution was to keep the file open. Unfortunately this was done in a garbage collected language, so every instance of the class had to be found, modified, and tested. And there were a couple thousand instances - several weeks of work for the programmers, several more for testing.

Case 2. The innermost loop was slow in an image editor. Inspection of the assembly language showed that the compiler was reading a byte, changing one bit, writing the byte, reading the same byte, changing the next bit, writing the byte again. This was done 8 times before the next byte was retrieved. The solution was to hand write the assembly code for the innermost loop. We rewrote it such that the byte was read, all 8 bits were modified, and the byte was written. Speedup of around a factor of 8.
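
Case 2 can be sketched in C++ instead of assembly. The functions below are hypothetical and assume a 1-bit-per-pixel image with at least `on.size()` bits of storage. The first mirrors the load/modify/store-per-bit pattern described above, and the second loads each byte once, updates all of its bits, and stores it once. A current optimizing compiler may well produce similar code for both, so treat this as an illustration of the idea rather than a guaranteed 8x.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-bit version: every pixel loads, modifies, and stores its byte.
void set_pixels_per_bit(std::vector<std::uint8_t>& image,
                        const std::vector<bool>& on) {
    for (std::size_t px = 0; px < on.size(); ++px) {
        std::uint8_t byte = image[px / 8];  // read
        const std::uint8_t mask = static_cast<std::uint8_t>(1u << (px % 8));
        byte = static_cast<std::uint8_t>(on[px] ? (byte | mask)
                                                : (byte & ~mask));
        image[px / 8] = byte;               // write
    }
}

// Per-byte version: one read, up to 8 bit updates, one write.
void set_pixels_per_byte(std::vector<std::uint8_t>& image,
                         const std::vector<bool>& on) {
    for (std::size_t b = 0; b * 8 < on.size(); ++b) {
        std::uint8_t byte = image[b];       // single read
        for (std::size_t bit = 0; bit < 8 && b * 8 + bit < on.size(); ++bit) {
            const std::uint8_t mask = static_cast<std::uint8_t>(1u << bit);
            byte = static_cast<std::uint8_t>(
                on[b * 8 + bit] ? (byte | mask) : (byte & ~mask));
        }
        image[b] = byte;                    // single write
    }
}
```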

@younesmdarhrialaoui643 - 08.03.2024 03:40

10/10

@moshegramovsky - 08.03.2024 03:31

Great video! I have wondered about all the hate I've seen for virtual functions and inheritance-based polymorphism in the last few years. Some of the proposed solutions are baffling in terms of performance. Especially some type erasure strategies like variants.

@bescein - 08.03.2024 02:58

If he wanted to prove that benchmarks show nothing, he chose his test subject poorly. The main downside of virtual functions is that they cannot be inlined. That's the whole reason people tend to avoid them.

@colonelmax1 - 08.03.2024 02:05

Those benchmarks are not representative at all. No warmup, no CPU power state and frequency locking, ...

@ABaumstumpf - 08.03.2024 01:01

I have seen code with comments about avoiding looping over the data a second time, or avoiding sorting a vector with like 20 elements... in a process that takes several milliseconds just for receiving and tracing out an event. Because yes, of course all events are traced, and that all happens synchronously on the handling thread. With conversions from strings to C-strings to string_view back to strings, going through 2 layers of iostream conversion and then to disk.

I put every non-essential trace call behind an "if(verboseTracing){...}" and did my own string formatting, concatenating multiple lines and buffering. The architect was telling me that I was introducing a lot of branching and other stuff that would kill performance... it managed to get a 100x increase in throughput and over 99% reduction in latency... all we needed was to reduce the system load by like 20% to stay within safe margins, so this was a nice success there.
If you are at the point that you need to care about any potential performance costs of virtual functions, then you have long since surpassed the point where you can just "reason" about the code and instead have to do a lot of testing on the exact hardware it will be deployed upon.
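
The guarded, buffered tracing described above might look roughly like the sketch below. `Tracer`, `handle_event`, the flush threshold, and the in-memory `flushed_` string standing in for the disk write are all assumptions made for illustration. The original code is not shown in the comment.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical tracer: buffers lines in memory and writes them out in bulk.
class Tracer {
public:
    explicit Tracer(bool verbose, std::size_t flush_threshold = 4096)
        : verbose_(verbose), threshold_(flush_threshold) {}

    bool verbose() const { return verbose_; }

    // Append one line to the buffer; flush once the buffer is large enough.
    void trace(std::string_view line) {
        buffer_.append(line);
        buffer_.push_back('\n');
        if (buffer_.size() >= threshold_) flush();
    }

    // Stand-in for a single write to disk.
    void flush() {
        flushed_ += buffer_;
        buffer_.clear();
        ++flush_count_;
    }

    const std::string& output() const { return flushed_; }
    std::size_t flush_count() const { return flush_count_; }

private:
    bool verbose_;
    std::size_t threshold_;
    std::string buffer_;
    std::string flushed_;
    std::size_t flush_count_ = 0;
};

void handle_event(Tracer& t, int event_id) {
    t.trace("event received");  // essential trace, always emitted
    if (t.verbose()) {
        // Non-essential detail: the string is only built when enabled.
        t.trace("event detail: id=" + std::to_string(event_id));
    }
}
```

The extra branch is a single well-predicted test, while the work it skips (string formatting and I/O) is far more expensive, which is consistent with the result the comment reports.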

@alskidan - 08.03.2024 00:41

My conclusions: benchmarking is hard, especially when you don’t know what you’re measuring 🤣

@lol785612349 - 08.03.2024 00:22

So let's summarize: Premature optimization is the root of all evil, and the optimization is premature earlier than you think.

@AlfredoCorrea - 08.03.2024 00:03

The advantage of variant is the value semantics it provides (runtime values), while virtual functions are ideal for interfacing with "plugins" (unbounded dynamic behavior). Neither use case is performance driven in principle. IMO the good news is that there are red flags for each use case: if you are using get_if or holds_alternative all the time, variant is not the right choice; if you are using dynamic_cast all the time, virtual is not the right choice. Also, it is not a binary choice: there is a whole spectrum of type erasure techniques that happen to include these two cases.
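
To make the variant red flag concrete, here is a small sketch. `Value`, `describe_by_probing`, and `describe_by_visit` are hypothetical names. The first function probes the active alternative by hand, which is the pattern the comment warns about when it shows up everywhere, and the second lets `std::visit` select the handler.

```cpp
#include <string>
#include <variant>

using Value = std::variant<int, double, std::string>;

// Red-flag style: the caller asks "which type is it?" over and over.
std::string describe_by_probing(const Value& v) {
    if (std::holds_alternative<int>(v)) return "int";
    if (std::get_if<double>(&v) != nullptr) return "double";
    return "string";
}

// Visitor style: one overload per alternative, chosen by std::visit.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};

std::string describe_by_visit(const Value& v) {
    return std::visit(Describe{}, v);
}
```

Copying a `Value` yields an independent object, which is the value-semantics advantage mentioned above. With the visitor, adding a new alternative without a matching overload fails to compile instead of silently falling through to the last branch.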

@chrisminnoy3637 - 07.03.2024 23:39

Guys, maybe you should study the branch prediction strategies of modern CPUs and cache management. That will give you insight into how specific CPUs completely eliminate the time difference between a direct and an indirect call more than 90% of the time. But smaller code will run faster because more code fits into cache lines, and indirect calls are a bit bigger. So then the performance of cache handling is again the determining factor. To conclude, just use virtual functions when it makes sense.

@Dominik-K - 07.03.2024 23:39

Really good presentation! Performance can be important, and so is not losing oneself in assembly when it isn't necessary.

@CartoType - 07.03.2024 23:29

A very good presentation. Clear, to the point, correctly paced. And a surprising and useful conclusion.

@user-ge6yb9rj9o - 07.03.2024 23:09

Revisiting the debate on virtual method performance seems redundant, even if not as slow today as some believe.
The justification for OOP and therefore virtual methods has become less pressing as the industry shifts towards composition over inheritance,
so let's focus on the best practices that have evolved rather than defending outdated ones.
Benchmarking is definitely not useless; it complements profiling in the same way that unit tests do for integration tests.

@r2com641 - 07.03.2024 22:12

You're right, it's better just to optimize out the whole of C++ and use a sane language. Thank god we are in 2024 and have choices 👌

@sampro454 - 07.03.2024 22:01

10/10 "who am I?" slide

@videofountain - 07.03.2024 21:20

Thanks. Cache, Benchmark, Profile. Interesting casting of doubt 🎊 to make more work for programmers and have a community consider more of the total measurement picture.

@surters - 07.03.2024 21:12

Branch target buffers have increased over the years, that might be why it suddenly is a different scenario now.

@soumen_pradhan - 07.03.2024 20:21

I now know less.

@PaulTopping1 - 07.03.2024 20:06

The important takeaway from this whole talk, IMHO, is that design issues should determine whether you should use virtual functions, not performance considerations. Only when something doesn't perform well enough for the task should you worry about eliminating virtual functions and only as part of a comprehensive performance analysis where that is only one of many strategies under consideration. Makes sense to me.

@TheOnlyAndreySotnikov - 07.03.2024 20:06

The primary source of virtual function slowness is that the compiler can't inline them and thus can't optimize across the function boundaries. A simple non-virtual getter, in most cases, will be optimized to a register operation; a virtual getter, on the other hand, will require an indirect call and will be orders of magnitude slower. You are done if you have such a getter in an often-executed loop.
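
The getter scenario can be sketched as follows. `PlainPoint`, `PointBase`, `VirtualPoint`, and the two `sum_*` loops are hypothetical and exist only to show the two call shapes. Whether the compiler can see through the virtual call depends on what it can prove about the dynamic type, and the actual cost difference should be measured rather than assumed.

```cpp
#include <vector>

// Non-virtual getter: the call target is known at compile time,
// so the compiler is free to inline it into the loop.
struct PlainPoint {
    int x;
    int get_x() const { return x; }
};

// Virtual getter: through a base pointer the call is indirect unless
// the compiler can prove the dynamic type and devirtualize it.
struct PointBase {
    virtual ~PointBase() = default;
    virtual int get_x() const = 0;
};
struct VirtualPoint : PointBase {
    int x;
    explicit VirtualPoint(int v) : x(v) {}
    int get_x() const override { return x; }
};

long sum_plain(const std::vector<PlainPoint>& pts) {
    long total = 0;
    for (const PlainPoint& p : pts) total += p.get_x();
    return total;
}

long sum_virtual(const std::vector<const PointBase*>& pts) {
    long total = 0;
    for (const PointBase* p : pts) total += p->get_x();
    return total;
}
```

Comparing the generated assembly for the two loops at the optimization level you actually ship is a quick way to check whether inlining happened in your case.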

@ArthurGreen-bw3sb - 07.03.2024 19:55

The thing about which cases are "natural" for inheritance seems to be an important issue. If you've spent a decade writing Java and using GoF patterns, then inheritance becomes the natural solution for a lot of problems that you might use other techniques for otherwise.

@LarryOsterman - 07.03.2024 19:38

A great exploration. And a graphic demonstration of why doing performance analysis in the world of modern processors is unbelievably hard (edited to remove a reference to benchmarking, since it's a broader area of challenges).
