From 312cabc3f7ccaa4a7011260b8f9484d0ee64f6ac Mon Sep 17 00:00:00 2001 From: Kevin O'Connor Date: Thu, 28 Mar 2019 09:47:10 -0400 Subject: [PATCH] docs: Move benchmark information from Debugging.md to new Benchmarks.md Signed-off-by: Kevin O'Connor --- docs/Benchmarks.md | 288 +++++++++++++++++++++++++++++++++++++++++++++ docs/Debugging.md | 288 +-------------------------------------------- docs/Features.md | 3 +- docs/Overview.md | 3 +- 4 files changed, 293 insertions(+), 289 deletions(-) create mode 100644 docs/Benchmarks.md diff --git a/docs/Benchmarks.md b/docs/Benchmarks.md new file mode 100644 index 00000000..918489ca --- /dev/null +++ b/docs/Benchmarks.md @@ -0,0 +1,288 @@ +This document describes Klipper benchmarks. + +Micro-controller Benchmarks +=========================== + +This section describes the mechanism used to generate the Klipper +micro-controller step rate benchmarks. + +The primary goal of the benchmarks is to provide a consistent +mechanism for measuring the impact of coding changes within the +software. A secondary goal is to provide high-level metrics for +comparing the performance between chips and between software +platforms. + +The step rate benchmark is designed to find the maximum stepping rate +that the hardware and software can reach. This benchmark stepping rate +is not achievable in day-to-day use as Klipper needs to perform other +tasks (eg, mcu/host communication, temperature reading, endstop +checking) in any real-world usage. + +In general, the pins for the benchmark tests are chosen to flash LEDs +or other innocuous pins. **Always verify that it is safe to drive the +configured pins prior to running a benchmark.** It is not recommended +to drive an actual stepper during a benchmark. + +## Step rate benchmark test ## + +The test is performed using the console.py tool (described in +[Debugging.md](Debugging.md)). 
The micro-controller is configured for +the particular hardware platform (see below) and then the following is +cut-and-paste into the console.py terminal window: +``` +SET start_clock {clock+freq} +SET ticks 1000 + +reset_step_clock oid=0 clock={start_clock} +set_next_step_dir oid=0 dir=0 +queue_step oid=0 interval={ticks} count=60000 add=0 +set_next_step_dir oid=0 dir=1 +queue_step oid=0 interval=3000 count=1 add=0 + +reset_step_clock oid=1 clock={start_clock} +set_next_step_dir oid=1 dir=0 +queue_step oid=1 interval={ticks} count=60000 add=0 +set_next_step_dir oid=1 dir=1 +queue_step oid=1 interval=3000 count=1 add=0 + +reset_step_clock oid=2 clock={start_clock} +set_next_step_dir oid=2 dir=0 +queue_step oid=2 interval={ticks} count=60000 add=0 +set_next_step_dir oid=2 dir=1 +queue_step oid=2 interval=3000 count=1 add=0 +``` + +The above tests three steppers simultaneously stepping. If running the +above results in a "Rescheduled timer in the past" or "Stepper too far +in past" error then it indicates the `ticks` parameter is too low (it +results in a stepping rate that is too fast). The goal is to find the +lowest setting of the ticks parameter that reliably results in a +successful completion of the test. It should be possible to bisect the +ticks parameter until a stable value is found. + +On a failure, one can copy-and-paste the following to clear the error +in preparation for the next test: +``` +clear_shutdown +``` + +To obtain the single stepper and dual stepper benchmarks, the same +configuration sequence is used, but only the first block (for the +single stepper case) or first two blocks (for the dual stepper case) +of the above test is cut-and-paste into the console.py window. + +To produce the benchmarks found in the Features.md document, the total +number of steps per second is calculated by multiplying the number of +active steppers with the nominal mcu frequency and dividing by the +final ticks parameter. The results are rounded to the nearest K. 
For +example, with three active steppers: +``` +ECHO Test result is: {"%.0fK" % (3. * freq / ticks / 1000.)} +``` + +### AVR step rate benchmark ### + +The following configuration sequence is used on AVR chips: +``` +PINS arduino +allocate_oids count=3 +config_stepper oid=0 step_pin=ar29 dir_pin=ar28 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=ar27 dir_pin=ar26 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=ar23 dir_pin=ar22 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `b161a69e` with gcc version `avr-gcc +(GCC) 4.8.1`. Both the 16Mhz and 20Mhz tests were run using simulavr +configured for an atmega644p (previous tests have confirmed simulavr +results match tests on both a 16Mhz at90usb and a 16Mhz atmega2560). +On both 16Mhz and 20Mhz the best single stepper result is `SET ticks +106`, the best dual stepper result is `SET ticks 276`, and the best +three stepper result is `SET ticks 481`. + +### Arduino Due step rate benchmark ### + +The following configuration sequence is used on the Due: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=PB27 dir_pin=PA21 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PB26 dir_pin=PC30 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PA21 dir_pin=PC30 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `74c21654` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single +stepper result is `SET ticks 388`, the best dual stepper result is +`SET ticks 405`, and the best three stepper result is `SET ticks 576`. 
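The step rate calculation described earlier (number of active steppers multiplied by the nominal mcu frequency, divided by the final ticks value, rounded to the nearest K) can be sketched in plain Python. This is only an illustration and is not part of console.py; the function name `step_rate_label` and its parameter names are made up for the example. The inputs are the 16Mhz AVR `SET ticks` results quoted above.

```python
def step_rate_label(active_steppers: int, mcu_freq_hz: float, ticks: int) -> str:
    """Return the total steps per second, rounded to the nearest K.

    Mirrors the console.py expression "%.0fK" % (n * freq / ticks / 1000.),
    where n is the number of simultaneously active steppers.
    """
    steps_per_second = active_steppers * mcu_freq_hz / ticks
    return "%.0fK" % (steps_per_second / 1000.0)


if __name__ == "__main__":
    # (active steppers, best "SET ticks" value) for the AVR benchmark above.
    avr_results = ((1, 106), (2, 276), (3, 481))
    for steppers, ticks in avr_results:
        # 16Mhz nominal clock, as stated in the AVR section.
        print(steppers, step_rate_label(steppers, 16_000_000, ticks))
```

With these inputs the sketch prints 151K, 116K, and 100K for one, two, and three active steppers respectively.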
+ +### Duet Maestro step rate benchmark ### + +The following configuration sequence is used on the Duet Maestro: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=PC26 dir_pin=PC18 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PC26 dir_pin=PA8 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PC26 dir_pin=PB4 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `74c21654` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single +stepper result is `SET ticks 553`, the best dual stepper result is +`SET ticks 563`, and the best three stepper result is `SET ticks 623`. + +### Duet Wifi step rate benchmark ### + +The following configuration sequence is used on the Duet Wifi: +``` +allocate_oids count=4 +config_stepper oid=0 step_pin=PD6 dir_pin=PD11 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PD7 dir_pin=PD12 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PD8 dir_pin=PD13 min_stop_interval=0 invert_step=0 +config_stepper oid=3 step_pin=PD5 dir_pin=PA1 min_stop_interval=0 invert_step=0 +finalize_config crc=0 + +``` + +The test was last run on commit `59a60d68` with gcc version +`arm-none-eabi-gcc 7.3.1 20180622 (release) +[ARM/embedded-7-branch revision 261907]`. The best single stepper +result is `SET ticks 519`, the best dual stepper result is `SET ticks +520`, and the best three stepper result is `SET ticks 525`, and the +best four stepper result is `SET ticks 703`. 
+ +### Beaglebone PRU step rate benchmark ### + +The following configuration sequence is used on the PRU: +``` +PINS beaglebone +allocate_oids count=3 +config_stepper oid=0 step_pin=P8_13 dir_pin=P8_12 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=P8_15 dir_pin=P8_14 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=P8_19 dir_pin=P8_18 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `b161a69e` with gcc version `pru-gcc +(GCC) 8.0.0 20170530 (experimental)`. The best single stepper result +is `SET ticks 861`, the best dual stepper result is `SET ticks 853`, +and the best three stepper result is `SET ticks 883`. + +### STM32F103 step rate benchmark ### + +The following configuration sequence is used on the STM32F103: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=PC13 dir_pin=PB5 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PB3 dir_pin=PB6 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PA4 dir_pin=PB7 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `9f3517fd` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single +stepper result is `SET ticks 345`, the best dual stepper result is +`SET ticks 365`, and the best three stepper result is `SET ticks 606`. + +### LPC176x step rate benchmark ### + +The following configuration sequence is used on the LPC176x: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=P1.20 dir_pin=P1.18 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=P1.21 dir_pin=P1.18 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=P1.23 dir_pin=P1.18 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `9f3517fd` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. 
For the 100Mhz +LPC1768, the best single stepper result is `SET ticks 448`, the best +dual stepper result is `SET ticks 450`, and the best three stepper +result is `SET ticks 523`. The 120Mhz LPC1769 results were obtained by +overclocking an LPC1768 to 120Mhz - the best single stepper result is +`SET ticks 525`, the best dual stepper result is `SET ticks 526`, and +the best three stepper result is `SET ticks 545`. + +### SAMD21 step rate benchmark ### + +The following configuration sequence is used on the SAMD21: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=PA27 dir_pin=PA20 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PB3 dir_pin=PA21 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PA17 dir_pin=PA21 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `9f3517fd` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single +stepper result is `SET ticks 277`, the best dual stepper result is +`SET ticks 410`, and the best three stepper result is `SET ticks 664`. + +### SAMD51 step rate benchmark ### + +The following configuration sequence is used on the SAMD51: +``` +allocate_oids count=3 +config_stepper oid=0 step_pin=PA22 dir_pin=PA20 min_stop_interval=0 invert_step=0 +config_stepper oid=1 step_pin=PA22 dir_pin=PA21 min_stop_interval=0 invert_step=0 +config_stepper oid=2 step_pin=PA22 dir_pin=PA19 min_stop_interval=0 invert_step=0 +config_stepper oid=3 step_pin=PA22 dir_pin=PA18 min_stop_interval=0 invert_step=0 +finalize_config crc=0 +``` + +The test was last run on commit `9f3517fd` with gcc version +`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0` on a SAMD51G19A +micro-controller. The best single stepper result is `SET ticks 516`, +the best dual stepper result is `SET ticks 520`, the best three +stepper result is `SET ticks 519`, and the best four stepper result is +`SET ticks 655`. 
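The step rate test description says it should be possible to bisect the ticks parameter until a stable value is found. A minimal sketch of that search follows, assuming a hypothetical callback `passes(ticks)` that runs the test once and returns True when it completes without a "Rescheduled timer in the past" or "Stepper too far in past" error. Nothing here talks to console.py; how the test is actually launched, and how the error is cleared between runs, is left to the caller. The sketch also assumes that any ticks value above the threshold passes, which may not hold exactly on real hardware, so a result should be re-run several times before it is treated as reliable.

```python
from typing import Callable


def find_min_ticks(passes: Callable[[int], bool], lo: int = 1, hi: int = 1000) -> int:
    """Return the smallest ticks value in [lo, hi] for which passes() is True.

    Assumes passes(hi) succeeds and that every value above the threshold
    also succeeds (a monotonic pass/fail boundary).
    """
    if not passes(hi):
        raise ValueError("test fails even at the upper bound; raise 'hi'")
    while lo < hi:
        mid = (lo + hi) // 2
        if passes(mid):
            hi = mid        # mid works, so the answer is mid or lower
        else:
            lo = mid + 1    # mid is too fast, so the answer is above mid
    return lo


if __name__ == "__main__":
    # Stand-in for a real test run: pretend the hardware copes with 519
    # ticks or more (the three stepper SAMD51 figure quoted above).
    def simulated(ticks: int) -> bool:
        return ticks >= 519

    print(find_min_ticks(simulated))
```

The default upper bound of 1000 matches the `SET ticks 1000` starting value used in the test sequence.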
+ +## Command dispatch benchmark ## + +The command dispatch benchmark tests how many "dummy" commands the +micro-controller can process. It is primarily a test of the hardware +communication mechanism. The test is run using the console.py tool +(described in [Debugging.md](Debugging.md)). The following is +cut-and-paste into the console.py terminal window: +``` +DELAY {clock + 2*freq} get_uptime +FLOOD 100000 0.0 end_group +get_uptime +``` + +When the test completes, determine the difference between the clocks +reported in the two "uptime" response messages. The total number of +commands per second is then `100000 * mcu_frequency / clock_diff`. + +Note that this test may saturate the USB/CPU capacity of a Raspberry +Pi. The benchmarks below are with console.py running on a desktop +class machine. + +| MCU | Rate | Build | Build compiler | +| ------------------- | ---- | -------- | ------------------- | +| pru (shared memory) | 5K | b161a69e | pru-gcc (GCC) 8.0.0 20170530 (experimental) | +| atmega2560 (serial) | 23K | b161a69e | avr-gcc (GCC) 4.8.1 | +| sam3x8e (serial) | 23K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| at90usb1286 (USB) | 75K | b161a69e | avr-gcc (GCC) 4.8.1 | +| samd21 (USB) | 238K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| stm32f103 (USB) | 335K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| sam3x8e (USB) | 450K | a5aede52 | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| lpc1768 (USB) | 546K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| sam4s8c (USB) | 619K | a5aede52 | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| lpc1769 (USB) | 619K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | +| samd51 (USB) | 620K | 8cd83b4c | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | + +Host Benchmarks +=============== + +It is possible to run timing tests on the host software using the +"batch mode" processing mechanism (described in +[Debugging.md](Debugging.md)). 
This is typically done by choosing a
+large and complex G-Code file and timing how long it takes for the
+host software to process it. For example:
+```
+time ~/klippy-env/bin/python ./klippy/klippy.py config/example.cfg -i something_complex.gcode -o /dev/null -d out/klipper.dict
+```
diff --git a/docs/Debugging.md b/docs/Debugging.md
index 6e2a2545..c4ac758d 100644
--- a/docs/Debugging.md
+++ b/docs/Debugging.md
@@ -1,4 +1,4 @@
-The Klippy host code has some tools to help in debugging.
+This document describes some of the Klipper debugging tools.
 
 Translating gcode files to micro-controller commands
 ====================================================
@@ -171,289 +171,3 @@ The script will extract the printer config file and will extract
 MCU shutdown information. The information dumps from an MCU shutdown
 (if present) will be reordered by timestamp to assist in diagnosing
 cause and effect scenarios.
-
-Micro-controller Benchmarks
-===========================
-
-This section describes the mechanism used to generate the Klipper
-micro-controller step rate benchmarks.
-
-The primary goal of the benchmarks is to provide a consistent
-mechanism for measuring the impact of coding changes within the
-software. A secondary goal is to provide high-level metrics for
-comparing the performance between chips and between software
-platforms.
-
-The step rate benchmark is designed to find the maximum stepping rate
-that the hardware and software can reach. This benchmark stepping rate
-is not achievable in day-to-day use as Klipper needs to perform other
-tasks (eg, mcu/host communication, temperature reading, endstop
-checking) in any real-world usage.
-
-In general, the pins for the benchmark tests are chosen to flash LEDs
-or other innocuous pins. **Always verify that it is safe to drive the
-configured pins prior to running a benchmark.** It is not recommended
-to drive an actual stepper during a benchmark.
- -## Step rate benchmark test ## - -The test is performed using the console.py tool (described above). The -micro-controller is configured for the particular hardware platform -(see below) and then the following is cut-and-paste into the -console.py terminal window: -``` -SET start_clock {clock+freq} -SET ticks 1000 - -reset_step_clock oid=0 clock={start_clock} -set_next_step_dir oid=0 dir=0 -queue_step oid=0 interval={ticks} count=60000 add=0 -set_next_step_dir oid=0 dir=1 -queue_step oid=0 interval=3000 count=1 add=0 - -reset_step_clock oid=1 clock={start_clock} -set_next_step_dir oid=1 dir=0 -queue_step oid=1 interval={ticks} count=60000 add=0 -set_next_step_dir oid=1 dir=1 -queue_step oid=1 interval=3000 count=1 add=0 - -reset_step_clock oid=2 clock={start_clock} -set_next_step_dir oid=2 dir=0 -queue_step oid=2 interval={ticks} count=60000 add=0 -set_next_step_dir oid=2 dir=1 -queue_step oid=2 interval=3000 count=1 add=0 -``` - -The above tests three steppers simultaneously stepping. If running the -above results in a "Rescheduled timer in the past" or "Stepper too far -in past" error then it indicates the `ticks` parameter is too low (it -results in a stepping rate that is too fast). The goal is to find the -lowest setting of the ticks parameter that reliably results in a -successful completion of the test. It should be possible to bisect the -ticks parameter until a stable value is found. - -On a failure, one can copy-and-paste the following to clear the error -in preparation for the next test: -``` -clear_shutdown -``` - -To obtain the single stepper and dual stepper benchmarks, the same -configuration sequence is used, but only the first block (for the -single stepper case) or first two blocks (for the dual stepper case) -of the above test is cut-and-paste into the console.py window. 
- -To produce the benchmarks found in the Features.md document, the total -number of steps per second is calculated by multiplying the number of -active steppers with the nominal mcu frequency and dividing by the -final ticks parameter. The results are rounded to the nearest K. For -example, with three active steppers: -``` -ECHO Test result is: {"%.0fK" % (3. * freq / ticks / 1000.)} -``` - -### AVR step rate benchmark ### - -The following configuration sequence is used on AVR chips: -``` -PINS arduino -allocate_oids count=3 -config_stepper oid=0 step_pin=ar29 dir_pin=ar28 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=ar27 dir_pin=ar26 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=ar23 dir_pin=ar22 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `b161a69e` with gcc version `avr-gcc -(GCC) 4.8.1`. Both the 16Mhz and 20Mhz tests were run using simulavr -configured for an atmega644p (previous tests have confirmed simulavr -results match tests on both a 16Mhz at90usb and a 16Mhz atmega2560). -On both 16Mhz and 20Mhz the best single stepper result is `SET ticks -106`, the best dual stepper result is `SET ticks 276`, and the best -three stepper result is `SET ticks 481`. - -### Arduino Due step rate benchmark ### - -The following configuration sequence is used on the Due: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=PB27 dir_pin=PA21 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PB26 dir_pin=PC30 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PA21 dir_pin=PC30 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `74c21654` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single -stepper result is `SET ticks 388`, the best dual stepper result is -`SET ticks 405`, and the best three stepper result is `SET ticks 576`. 
- -### Duet Maestro step rate benchmark ### - -The following configuration sequence is used on the Duet Maestro: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=PC26 dir_pin=PC18 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PC26 dir_pin=PA8 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PC26 dir_pin=PB4 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `74c21654` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single -stepper result is `SET ticks 553`, the best dual stepper result is -`SET ticks 563`, and the best three stepper result is `SET ticks 623`. - -### Duet Wifi step rate benchmark ### - -The following configuration sequence is used on the Duet Wifi: -``` -allocate_oids count=4 -config_stepper oid=0 step_pin=PD6 dir_pin=PD11 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PD7 dir_pin=PD12 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PD8 dir_pin=PD13 min_stop_interval=0 invert_step=0 -config_stepper oid=3 step_pin=PD5 dir_pin=PA1 min_stop_interval=0 invert_step=0 -finalize_config crc=0 - -``` - -The test was last run on commit `59a60d68` with gcc version -`arm-none-eabi-gcc 7.3.1 20180622 (release) -[ARM/embedded-7-branch revision 261907]`. The best single stepper -result is `SET ticks 519`, the best dual stepper result is `SET ticks -520`, and the best three stepper result is `SET ticks 525`, and the -best four stepper result is `SET ticks 703`. 
- -### Beaglebone PRU step rate benchmark ### - -The following configuration sequence is used on the PRU: -``` -PINS beaglebone -allocate_oids count=3 -config_stepper oid=0 step_pin=P8_13 dir_pin=P8_12 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=P8_15 dir_pin=P8_14 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=P8_19 dir_pin=P8_18 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `b161a69e` with gcc version `pru-gcc -(GCC) 8.0.0 20170530 (experimental)`. The best single stepper result -is `SET ticks 861`, the best dual stepper result is `SET ticks 853`, -and the best three stepper result is `SET ticks 883`. - -### STM32F103 step rate benchmark ### - -The following configuration sequence is used on the STM32F103: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=PC13 dir_pin=PB5 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PB3 dir_pin=PB6 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PA4 dir_pin=PB7 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `9f3517fd` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single -stepper result is `SET ticks 345`, the best dual stepper result is -`SET ticks 365`, and the best three stepper result is `SET ticks 606`. - -### LPC176x step rate benchmark ### - -The following configuration sequence is used on the LPC176x: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=P1.20 dir_pin=P1.18 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=P1.21 dir_pin=P1.18 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=P1.23 dir_pin=P1.18 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `9f3517fd` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. 
For the 100Mhz -LPC1768, the best single stepper result is `SET ticks 448`, the best -dual stepper result is `SET ticks 450`, and the best three stepper -result is `SET ticks 523`. The 120Mhz LPC1769 results were obtained by -overclocking an LPC1768 to 120Mhz - the best single stepper result is -`SET ticks 525`, the best dual stepper result is `SET ticks 526`, and -the best three stepper result is `SET ticks 545`. - -### SAMD21 step rate benchmark ### - -The following configuration sequence is used on the SAMD21: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=PA27 dir_pin=PA20 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PB3 dir_pin=PA21 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PA17 dir_pin=PA21 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `9f3517fd` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0`. The best single -stepper result is `SET ticks 277`, the best dual stepper result is -`SET ticks 410`, and the best three stepper result is `SET ticks 664`. - -### SAMD51 step rate benchmark ### - -The following configuration sequence is used on the SAMD51: -``` -allocate_oids count=3 -config_stepper oid=0 step_pin=PA22 dir_pin=PA20 min_stop_interval=0 invert_step=0 -config_stepper oid=1 step_pin=PA22 dir_pin=PA21 min_stop_interval=0 invert_step=0 -config_stepper oid=2 step_pin=PA22 dir_pin=PA19 min_stop_interval=0 invert_step=0 -config_stepper oid=3 step_pin=PA22 dir_pin=PA18 min_stop_interval=0 invert_step=0 -finalize_config crc=0 -``` - -The test was last run on commit `9f3517fd` with gcc version -`arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0` on a SAMD51G19A -micro-controller. The best single stepper result is `SET ticks 516`, -the best dual stepper result is `SET ticks 520`, the best three -stepper result is `SET ticks 519`, and the best four stepper result is -`SET ticks 655`. 
- -## Command dispatch benchmark ## - -The command dispatch benchmark tests how many "dummy" commands the -micro-controller can process. It is primarily a test of the hardware -communication mechanism. The test is run using the console.py tool -(described above). The following is cut-and-paste into the console.py -terminal window: -``` -DELAY {clock + 2*freq} get_uptime -FLOOD 100000 0.0 end_group -get_uptime -``` - -When the test completes, determine the difference between the clocks -reported in the two "uptime" response messages. The total number of -commands per second is then `100000 * mcu_frequency / clock_diff`. - -Note that this test may saturate the USB/CPU capacity of a Raspberry -Pi. The benchmarks below are with console.py running on a desktop -class machine. - -| MCU | Rate | Build | Build compiler | -| ------------------- | ---- | -------- | ------------------- | -| pru (shared memory) | 5K | b161a69e | pru-gcc (GCC) 8.0.0 20170530 (experimental) | -| atmega2560 (serial) | 23K | b161a69e | avr-gcc (GCC) 4.8.1 | -| sam3x8e (serial) | 23K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| at90usb1286 (USB) | 75K | b161a69e | avr-gcc (GCC) 4.8.1 | -| samd21 (USB) | 238K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| stm32f103 (USB) | 335K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| sam3x8e (USB) | 450K | a5aede52 | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| lpc1768 (USB) | 546K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| sam4s8c (USB) | 619K | a5aede52 | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| lpc1769 (USB) | 619K | b161a69e | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | -| samd51 (USB) | 620K | 8cd83b4c | arm-none-eabi-gcc (Fedora 7.1.0-5.fc27) 7.1.0 | - -Host Benchmarks -=============== - -It is possible to run timing tests on the host software using the -"batch mode" processing mechanism described above. 
This is typically
-done by choosing a large and complex G-Code file and timing how long
-it takes for the host software to process it. For example:
-```
-time ~/klippy-env/bin/python ./klippy/klippy.py config/example.cfg -i something_complex.gcode -o /dev/null -d out/klipper.dict
-```
diff --git a/docs/Features.md b/docs/Features.md
index 43a0cf89..f5facb9a 100644
--- a/docs/Features.md
+++ b/docs/Features.md
@@ -146,4 +146,5 @@ stepper stepping. On the SAMD21 and STM32F103 the highest step rate is
 with two simultaneous steppers stepping. On the SAM3X8E, SAM4S8C,
 SAM4E8E, LPC176x, and PRU the highest step rate is with three
 simultaneous steppers. On the SAMD51, the highest step rate is with
-four simultaneous steppers.
+four simultaneous steppers. (Further details on the benchmarks are
+available in the [Benchmarks document](Benchmarks.md).)
diff --git a/docs/Overview.md b/docs/Overview.md
index 5d1e4136..7463741d 100644
--- a/docs/Overview.md
+++ b/docs/Overview.md
@@ -46,6 +46,7 @@ protocol between host and micro-controller. See also
 commands implemented in the micro-controller software.
 
 See [debugging](Debugging.md) for information on how to test and debug
-Klipper. See [stm32f1](stm32f1.md) for information on the STM32F1
+Klipper. See [benchmarks](Benchmarks.md) for information on
+benchmarking. See [stm32f1](stm32f1.md) for information on the STM32F1
 micro-controller port. See [bootloaders](Bootloaders.md) for developer
 information on micro-controller flashing.