Answering Basic Assembly Language Questions - Assembly Language for Beginners

1 year ago

12,674 views

Comments:

Michael Wilkes - 13.09.2023 07:30

If we can use XOR to get perfect float math, why don't all CPUs just always do that? Why is floating point error still a problem we have to deal with? Even assuming that trick does not work with multiply and divide, making add and subtract perfect would be amazing.

pistolsatsean - 12.07.2023 09:03

I've got a question about assembly.
If you have a loop with a fixed iteration count, does it execute faster if written out line by line (even if only marginally so)?

artekmeister - 08.03.2023 21:35

I've been using ChatGPT in a spreadsheet and remembered you. What have you used ChatGPT for in assembly? These questions might be excellent prompts to explore how ChatGPT would suggest to do it. Your mileage may vary. Thanks for making these assembly videos.

kjrl - 26.02.2023 18:45

I've been learning about the open-source RISC-V assembly. Liking it so far.
Also, keep up the good work.

Brian Park - 28.01.2023 10:17

Your second scheme is what I refer to as "converting an algorithmic operation to an arithmetic operation", eliminating branches. The routine is "straight-line code". The complexity of code is proportional to the number of branches in it.
Your scheme for calculating the number of 1's in a number only works if you have the special instruction. Without one, I have a scheme for determining whether the number has one or fewer 1's in it. Copy the number to another register. Decrement one of the copies. Then do a bitwise AND between the two. If the result is zero, the number had one or fewer 1's.
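The copy, decrement, AND test described above can be sketched in C. This is a minimal illustration; the function name `at_most_one_bit_set` is made up for this example and is not from the video.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n has zero or one bit set.
   Subtracting 1 flips the lowest set bit and every bit below it,
   so ANDing with the original value clears exactly that lowest set bit.
   If nothing is left, there was at most one bit to begin with. */
bool at_most_one_bit_set(uint32_t n)
{
    return (n & (n - 1u)) == 0u;
}
```

For example, `at_most_one_bit_set(64)` is true, while `at_most_one_bit_set(6)` is false because 6 is binary 110.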
ROUNDING CRAP: get RID of floating point math! Floating point math belongs only in hastily "slapped together" programs written to get a quick answer. Most programmers are too lazy to properly scale their numbers.
For the sort: as you explain in your sort videos, there are 3! = 6 possible outcomes. Do 3 compares, say 1&2, 2&3, & 1&3. After each compare, shift the carry flag (set if 1st arg >= 2nd arg) into a precleared register with SHIFT LEFT (with carry). You now have a 3-bit number (8 possible outcomes, of which 6 are "legal"). Use this to index into a look-up table of the swaps required. Put the index of the "get" of the swap into the table of 8 entries.
Example: let's say A>B, A<C, & B<C. The 3 C-flags would be 1, 0, 0. These are shifted left into a cleared register, forming (binary) 4. There are 8 tables, each 3 cells long. The 4th of these contains 1, 0, 2, which indicates the "from" swap order is B, A, C. So the program enters straight-line code where we get "number displaced by 2" (C) & load it into the 1st of the output list, next get "number displaced by 0" (A), move it to the 2nd element in the list, & finally, move B into the last element in the list. The resulting numbers themselves are not subject to any math operations, so they cannot be corrupted.
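A rough C sketch of the three-compare, table-lookup idea follows. The table layout, the names `SORT3_ORDER` and `sort3_by_table`, and the choice of ascending output are assumptions made for illustration; the commenter's own table contents and output order may differ.

```c
#include <stdint.h>

/* Index bits: bit 2 = (a >= b), bit 1 = (b >= c), bit 0 = (a >= c).
   Each row lists which input position supplies out[0], out[1], out[2]
   for ascending order. Rows 1 and 6 describe contradictory comparison
   results and can never be selected. */
static const uint8_t SORT3_ORDER[8][3] = {
    {0, 1, 2}, /* 000: a <  b <  c  */
    {0, 1, 2}, /* 001: cannot occur */
    {0, 2, 1}, /* 010: a <  c <= b  */
    {2, 0, 1}, /* 011: c <= a <  b  */
    {1, 0, 2}, /* 100: b <= a <  c  */
    {1, 2, 0}, /* 101: b <  c <= a  */
    {0, 1, 2}, /* 110: cannot occur */
    {2, 1, 0}, /* 111: c <= b <= a  */
};

/* Sorts three values without arithmetic on the values themselves.
   `in` and `out` must not overlap. */
void sort3_by_table(const int32_t in[3], int32_t out[3])
{
    unsigned idx = ((unsigned)(in[0] >= in[1]) << 2)
                 | ((unsigned)(in[1] >= in[2]) << 1)
                 |  (unsigned)(in[0] >= in[2]);
    const uint8_t *order = SORT3_ORDER[idx];
    out[0] = in[order[0]];
    out[1] = in[order[1]];
    out[2] = in[order[2]];
}
```

Whether a compiler turns the three comparisons into flag-setting instructions rather than branches depends on the target and optimization level; in hand-written assembly the carry-shift approach described above gives that control directly.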
Too bad processors don't have a way to shift the zero flag into a register! Even fancier sorts could be done where equality must be tested.
Lookup tables can do other operations that can require an insane number of operations & time to compute. An example is bit scrambling or other re-arrangement, including insertion/removal of parity bits. The source bits can index into a table, & the scrambled bits are read out from the table. Multiple inputs can index into multiple tables which are OR'd or XOR'd together. When the source has a lot of bits (the table grows exponentially to insane size), the source can be broken up into smaller pieces & processed in multiple tables. An example I programmed is the "conventional" CRC16 code (used on floppy disks for error detection) for use in a robot radio link. Code on a small 8-bit machine would require ~120 clock cycles to compute 1 new bit of output with carefully-crafted assembly code. With three 256-byte tables, my routine can calculate 8 new bits of output using only 27 clock cycles (7 or 8 instructions).
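To illustrate the table idea for CRCs, here is a sketch that compares a bit-at-a-time CRC with a byte-at-a-time table lookup. It assumes the CCITT polynomial 0x1021 with initial value 0xFFFF, and it uses a single 256-entry table of 16-bit values, which is a common layout and not the commenter's three 256-byte tables. All function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t crc16_table[256];

/* Advance the CRC register by one bit (MSB-first, polynomial 0x1021). */
static uint16_t crc16_step_bit(uint16_t crc)
{
    return (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                           : (uint16_t)(crc << 1);
}

/* Entry i holds the effect of feeding the byte i through 8 single-bit steps. */
void crc16_init_table(void)
{
    for (unsigned i = 0; i < 256; i++) {
        uint16_t crc = (uint16_t)(i << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = crc16_step_bit(crc);
        crc16_table[i] = crc;
    }
}

/* Reference version: 8 shift/XOR steps per input byte. */
uint16_t crc16_bitwise(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = crc16_step_bit(crc);
    }
    return crc;
}

/* Table version: one lookup, one shift and two XORs per input byte.
   crc16_init_table() must be called first. */
uint16_t crc16_table_driven(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++)
        crc = (uint16_t)((crc << 8) ^ crc16_table[((crc >> 8) ^ data[i]) & 0xFFu]);
    return crc;
}
```

Both functions should agree on every input; with these parameters the nine ASCII bytes "123456789" give 0x29B1.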

devmishra18 - 20.10.2022 23:34

I don't even wanna learn assembly, but I still watch your videos as they make me feel smart.

Ebb Flow - 15.08.2022 07:15

Excellent stuff, amazing channel!!!!!

Andre Poelman - 11.08.2022 16:18

Nice video! I am probably too late to the party, but I think you didn't answer question 3. The question was how to count bits in a dword on an 8086 (16-bit processor). No fancy bitcount instructions there.
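For reference, one way to count set bits without a dedicated instruction is to clear the lowest set bit repeatedly and count the iterations. The C sketch below shows the idea on a 32-bit value; the name `popcount32` is made up here, and on a 16-bit CPU such as the 8086 the same loop would have to run over the two 16-bit halves.

```c
#include <stdint.h>

/* Counts set bits by clearing the lowest set bit until none remain.
   The loop body runs once per set bit, not once per bit position. */
unsigned popcount32(uint32_t n)
{
    unsigned count = 0;
    while (n != 0u) {
        n &= n - 1u;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

For example, `popcount32(0xF0)` returns 4.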

FalcoGer - 11.08.2022 12:15

Okay, adding two numbers together is easy, but that's not really helpful, is it? What we want isn't to read the result off in the debugger, or to change our code to change which numbers to add. In other words, I/O is missing. I know that to deal with I/O you use syscalls or system interrupts, or, in embedded devices, you access the mapped memory of attached devices and read/write values at the addresses that map to those devices, possibly in response to an interrupt.

Yusuf - 04.08.2022 14:24

What happened to the FADD (float add) instruction? Why do we use SIMD all the time for one floating-point value?

William Drum - 02.08.2022 07:04

I have a question about ARM Assembly. If you use malloc, will the kernel try to give you a pointer that is 8-bit rotatable (i.e. can be loaded into a register using a single instruction?)

decky1990 - 30.07.2022 18:53

Do you have Irish in your family??

Gilman Nayeem - 30.07.2022 02:56

But that comment is 4 months old

Rickey Bowers - 30.07.2022 01:44

Glad to see you laying some of the groundwork for assembly.

First-Thought // Giver-of-Will - 29.07.2022 13:16

We need to get all the language experts in a room (you being one of them) and create another assembly abstraction like C but with modern memory protection and better/modern op representation built in to the syntax but still being a "mid/low level" structured typed functional programming language that closely represents the codegen.

AllMyCircuits - 29.07.2022 11:19

A drawback of these sorting methods is that they can't be applied if there are not only "keys" but also values which should be sorted with these keys. Plain old bubble sort in that case, I presume...

Nice video nevertheless!

Marko Dukši - 29.07.2022 10:03

Great content, always keeping me stoked for the next video. For clarity's sake, don't you think you should update the leftover comments that still state that xoring "sums" or "subtracts"? You even sinfully say it out loud. 🙂 It accumulates and extracts, which is good enough and just what we want, but far from adding or subtracting. Keep it up pleeeease! 👍

ThatCrockpot - 29.07.2022 09:19

I'm always happy to see you posting

Julian - 29.07.2022 08:27

Could you possibly do an explanation of how to call C library functions like puts() from assembly, or maybe just link to a guide with the correct answer? I've found a couple different guides online and I couldn't get it to work for one reason or another. I'm just not experienced enough to know why. I'm using VS2022 btw

Neil Roy - 29.07.2022 08:01

Fascinating stuff. Love your videos, I'm always impressed by your knowledge and find this all VERY interesting! Keep up the good work, thanks. 🙂

RSK - 29.07.2022 04:53

Hey man, you've got nice skills, and you look & talk similar to Mr. Rocky Balboa! 👍🙂

Mike's Basement - 29.07.2022 03:49

Another potential problem with the additive method is the potential for overflow. Sure, it's not likely with three values, but it is possible. What instruction format do you prefer (AT&T, Intel, NASM)? Over the years I've found I'm liking nasm more.

Kamerton Audiophile player - 29.07.2022 02:51

Finally, somebody does some real programming.

Aaron - 29.07.2022 00:49

I love your videos, I can just watch them without having to think too much but still learn a lot

Olteanu Mihai - 28.07.2022 23:40

Highly underrated channel! Keep up the good work!

SimGunther - 28.07.2022 23:03

I think the important takeaway here is that if you haven't experienced the pain of making your assembler for a fictional CPU, you don't truly know the assembly meta.

Max Muster - 28.07.2022 18:56

x86 conditional jump instructions:
for unsigned values
JA jump above
JB jump below
...
for signed values
JG jump greater
JL jump less
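The signed/unsigned split listed above shows up in C as the type of the operands: the same 32-bit pattern compares differently depending on how it is interpreted. A small sketch follows; the function names are made up, and the mapping to JB/JL in the comments describes what x86 compilers typically emit, not a guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned comparison: on x86 this is typically a CMP followed by a
   "below"-family condition (JB, SETB, CMOVB). */
bool below_unsigned(uint32_t a, uint32_t b)
{
    return a < b;
}

/* Signed comparison: same CMP, but a "less"-family condition (JL, SETL, CMOVL). */
bool less_signed(int32_t a, int32_t b)
{
    return a < b;
}
```

With the all-ones bit pattern, `less_signed(-1, 1)` is true, while `below_unsigned((uint32_t)-1, 1)` is false because the same bits mean 4294967295 when unsigned.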

Max Muster - 28.07.2022 18:38

Using Intel syntax:
1. fast addition, 16-bit instructions:
LEA bx, [bx+si] ; no memory access, no flags touched, result has to fit the target
32-bit:
LEA ecx, [ecx+eax]

lohphat - 28.07.2022 17:18

How do compilers generate object code which can run on the variety of AMD64 family CPUs?

There are so many variants which have extended complex action opcodes, how can the compiler know when to use those opcodes? I know there are compiler flags but in software distribution it’s impossible to know ahead of time which CPU instructions are supported. How is this handled at runtime?

Bobo Bobee - 28.07.2022 17:07

ASM is so sexy…..

Terje Mathisen - 28.07.2022 16:28

The sort3() function via XOR is very neat; you could in fact do it in scalar code using 64-bit integer regs!
The scariest part of sorting fp numbers comes when you have infinities or NaNs:
I.e. the approach of adding all three and subtracting the min and max fails completely, both with a single Inf and with a single NaN, even though the ordering is at least defined when you have a mix of regular numbers and a single Inf.
With NaN, all ordered comparisons (and ==) return false!
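The failure mode described above can be reproduced with a few lines of C. This sketch assumes IEEE 754 floats and default (non fast-math) compiler settings; `median3_additive` is a made-up name for the add-then-subtract method discussed in this thread.

```c
#include <math.h>   /* only for the INFINITY and NAN macros */

static float min2(float x, float y) { return x < y ? x : y; }
static float max2(float x, float y) { return x > y ? x : y; }

/* Middle value computed as (a + b + c) - min - max. */
float median3_additive(float a, float b, float c)
{
    float lo = min2(min2(a, b), c);
    float hi = max2(max2(a, b), c);
    return a + b + c - lo - hi;
}
```

With ordinary inputs such as 1, 2, 3 the result is 2. With one infinite input the sum is already infinite, and subtracting the infinite maximum gives Inf minus Inf, which is NaN, even though the true middle value is finite.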

change_profile_n - 28.07.2022 14:58

So I found your content due to the fact that I'd like to start with an understanding on how computers work. I just started out learning Java as well as assembly. I don't do that because of commercial reasons but for reasons of fascination. Thank you for your work! Greetings from Switzerland 🍾

BTW: I have no experience in programming/computer architecture, but built an 8bit calculator in Minecraft (Redstone). This gave me a huge fascination to core concepts in CS and EE. Highly recommend this!

gower1973 - 28.07.2022 14:13

What’s your day job? Are you a systems engineer? Or do you contribute to open source projects, is that ebook just for patreons or can anyone read it?

zxuiji - 28.07.2022 13:55

Just a note for those implementing FPN comparisons via binary, treat the sign, exponent & mantissa as separate comparisons:

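int cmpf( fpn a, fpn b )
{
    int sigA, sigB, expA, expB, r;
    intmax_t manA, manB;
    /* Extract info */
    ...
    /* Different signs: the negative value is the smaller one */
    if ( sigA != sigB )
        return sigB - sigA;
    /* Same sign: compare magnitudes, exponent first, then mantissa.
       cmp() is assumed to return <0, 0 or >0. */
    if ( expA != expB )
        r = expA < expB ? -1 : 1;
    else
        r = cmp(manA,manB,bits);
    /* For two negative values the larger magnitude is the smaller number */
    return sigA ? -r : r;
}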

fpn minf( fpn a, fpn b ) { return cmpf(a, b) < 0 ? a : b; }
fpn maxf( fpn a, fpn b ) { return cmpf(a, b) > 0 ? a : b; }

Doing it that way avoids the possibility of incorrect return values (provided I got the signs the right way round in cmpf)
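A related way to compare floats through their bit patterns is to map each value to an unsigned key whose integer order matches the numeric order. The sketch below is not the commenter's cmpf; it assumes `float` is a 32-bit IEEE 754 value, and the names `float_order_key` and `cmpf_bits` are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Maps a float to a key that sorts the same way as the float itself.
   Negative values: invert all bits, so larger magnitudes get smaller keys.
   Non-negative values: set the sign bit, so they sort above all negatives. */
static uint32_t float_order_key(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read the bit pattern */
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

/* Returns -1, 0 or 1, like a typical three-way compare. */
int cmpf_bits(float a, float b)
{
    uint32_t ka = float_order_key(a);
    uint32_t kb = float_order_key(b);
    return (ka > kb) - (ka < kb);
}
```

Note that this ordering treats -0.0 as smaller than +0.0 and places NaNs at the extreme ends, which differs from the IEEE comparison rules.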

Pierre Untel - 28.07.2022 12:10

Oh yes, I remember when I was a young lad and started writing my first code in AutoIt and trying to figure out what ASM is, and I was like... "WTF are these? Are they just there for moving numbers around, adding and subtracting them? What for?" as I was trying to create a program with a nice UI and messageBox and stuff... I'm pretty sure there are many people out there having the same question when looking at ASM at first ;) One day it just clicks, and I still have no idea what I'm doing with ASM most of the time, but I can read and understand some parts of it.

Patrick L - 28.07.2022 12:01

To sort 3 numbers, I'd rather suggest this method, which requires 6 operations (instead of 8 in the video):
min3 = min(min(a,b), c); //2 operations
max3 = max(max(a,b), c); //2 operations
med3 = max(min(a,b), min(c, max(a,b))); //4 operations, but can be done in 2 operations instead!
As the temporary results of "min(a,b)" and "max(a,b)" can be kept in registers from the first steps, this method requires only 6 min/max operations!!!
(BTW, no more issues with floating point precision)
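Written out in C with the two shared temporaries made explicit, the six-operation version above looks roughly like this. The helper names are made up, and the ternary min/max stand in for the hardware min/max instructions.

```c
static float min2(float x, float y) { return x < y ? x : y; }
static float max2(float x, float y) { return x > y ? x : y; }

/* Sorts a, b, c into out[0] <= out[1] <= out[2] using six min/max operations. */
void sort3_minmax(float a, float b, float c, float out[3])
{
    float lo_ab = min2(a, b);                   /* 1 */
    float hi_ab = max2(a, b);                   /* 2 */
    out[0] = min2(lo_ab, c);                    /* 3: overall minimum */
    out[2] = max2(hi_ab, c);                    /* 4: overall maximum */
    out[1] = max2(lo_ab, min2(c, hi_ab));       /* 5, 6: middle value */
}
```

Because the values are only selected and never added or subtracted, every output is bit-for-bit one of the inputs.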

Un Perrier - 28.07.2022 10:31

There's another concern with the min/max/subtract technique for sorting 3 numbers that is worse than losing floating point precision: if all three floating point numbers are close enough to the largest representable value, adding them will overflow.
Not sure what an overflow looks like with floating point, but if it's like with integers you'll get something very wrong in the end.
In any case, thanks for the video, that's interesting. I'd be for a follow-up with more usual patterns and tricks. And maybe another video about ARM and RISCV assembly at some point?
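On the overflow question above: with IEEE 754 floats, a sum that exceeds the largest finite value becomes infinity instead of wrapping around as integers do. A minimal sketch, assuming IEEE 754 `float`; `sum3` is a made-up helper, and the `volatile` temporaries are only there to force each partial sum to be rounded to `float`.

```c
#include <float.h>  /* FLT_MAX */
#include <math.h>   /* INFINITY macro */

/* Adds three floats, rounding to float after each addition. */
float sum3(float a, float b, float c)
{
    volatile float partial = a + b;
    volatile float total = partial + c;
    return total;
}
```

`sum3(FLT_MAX, FLT_MAX, FLT_MAX)` is infinity, and subtracting the minimum and maximum afterwards leaves infinity, so the additive method would report an infinite middle value instead of `FLT_MAX`.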

Christian Baune - 28.07.2022 10:28

Reminder of the properties of XOR (⊕):
a⊕a=0
a⊕b=b⊕a
(a⊕b)⊕c=a⊕(b⊕c)=a⊕b⊕c
a⊕0=a

So, if we call the 3 registers a, b and c, and the min and max m and M respectively, we have the following expressions:
a⊕b⊕c (xor the 3)
(a⊕b⊕c)⊕m⊕M (and xor with min and max)
Let's say m=a and M=c (it could be any pair), then the expression becomes:
(a⊕b⊕c)⊕a⊕c
Per the above property we can remove parenthesis:
a⊕b⊕c⊕a⊕c
We can move values and group them:
(a⊕a)⊕b⊕(c⊕c)
We can also reduce the parenthesis:
0⊕b⊕0
Which evaluates to:
b
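The derivation above translates directly into code. The sketch below works on unsigned integers; for floats the same XORs would be applied to the raw bit patterns. The name `median3_xor` is made up for this example.

```c
#include <stdint.h>

/* Middle of three values: XOR all three together, then XOR out the minimum
   and the maximum. Each of those appears twice and cancels (x ^ x == 0),
   leaving the remaining value untouched. */
uint32_t median3_xor(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t lo_ab = a < b ? a : b;
    uint32_t hi_ab = a < b ? b : a;
    uint32_t lo = lo_ab < c ? lo_ab : c;   /* minimum of the three */
    uint32_t hi = hi_ab > c ? hi_ab : c;   /* maximum of the three */
    return a ^ b ^ c ^ lo ^ hi;
}
```

For example, `median3_xor(7, 3, 5)` returns 5, and no overflow or rounding can occur because XOR never carries between bits.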

Christian Baune - 28.07.2022 10:05

Also, fp errors are "weird" as the gap between "consecutive" numbers just widens like crazy as you get far from 0. (Expected, since the mantissa has a finite precision ^^)
I think that fp introduces too much weirdness because of that and can be a big hurdle for beginners.

Christian Baune - 28.07.2022 09:50

Oh, just discovered CMOV 😢 That would have been extra useful when I was doing CS. The worst part is that I read through helppc at the time to find useful mnemonics we didn't learn. I don't know how I missed that one! That shows how basics can benefit everyone ^^

ChrisM541 - 28.07.2022 09:39

Excellent video, cheers for the upload. I wish there were conditional moves back in the day with the 6502/10, then again, I like spaghetti.
For the count-the-set-bits question, using that 8-bit 6510, I would look at bit shifting, e.g. ASL of the value being examined (split over 4 bytes), examining the carry flag and incrementing a counter if it is set. I wrote the following as one way to do the job.

        ldy #4          ;4 bytes
lp0     lda Data-1,y
        ldx #8          ;8 bits
lp1     asl
        bcs BitSet
lp2     dex
        bne lp1
        dey
        bne lp0
        rts

BitSet  inc Count
        bne lp2         ;faster than jmp and ok to use as long as Count never wrapped to 0

;faster if below in zero page
Count   byte $0
Data    byte %11110000,%00000111,%10101010,%11100111

Wayne VanWeerthuizen - 28.07.2022 08:26

How, in assembly language, does one write a function that takes two or more arguments and returns a result? And how does one afterwards call that function from other languages, such as Python, C++, C#, or F#?

Dennis Bautembach - 28.07.2022 05:26

Does a jump always have to immediately follow a cmp? Or could you execute some other instructions in-between?

zeyogoat - 28.07.2022 05:24

You've been teaching this chem teacher to code for years now. Cheers! One question I have: How would you sort four numbers in asm?

Dennis Bautembach - 28.07.2022 05:18

Damn mate it's nice to have you back but you put on some weight. Please don't let it get any worse!

Anders Juel Jensen - 28.07.2022 05:09

This was really interesting. I've only ever come across assembly in the Linux kernel's architecture-dependent code. It looks like you need to be a certain kind of masochist to enjoy the challenge of writing actual problem-solving code in assembly... I should give it a try...
