Topic: APLX Help : Help on APL language : System Functions & Variables : ⎕PROFILE Performance Profiling

⎕PROFILE Performance Profiling

Performance profiling can be used to find out which parts of your APL code take the most time to execute, or are executed most often, and so helps you to determine which functions to concentrate on when optimising performance.

For simple profiling it is not necessary to use ⎕PROFILE. Instead you can enable profiling through the APLX Tools menu and then run the code to be profiled. When the code completes and APLX returns to desk calculator mode, the profile is automatically shown in a Profile window.

For more control over the profiling process you can use the ⎕PROFILE system function described here.


⎕PROFILE is monadic. The right argument is usually a nested vector, the first element of which is a keyword (such as 'on' or 'data'), and the remaining elements of which are parameters specific to the operation being carried out. (For certain operations, which take no parameters, a simple character vector argument of just the keyword can be supplied.) Keywords are case-insensitive.
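
As an illustration of the two argument forms, the following lines use only keywords documented on this page:

```apl
⎕PROFILE 'on' 0        ⍝ nested vector: keyword plus one parameter (method 0, the default)
r←⎕PROFILE 'state'     ⍝ simple character vector: keyword alone, no parameters
r←⎕PROFILE 'STATE'     ⍝ same request, because keywords are case-insensitive
```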

Turning profiling on

⎕PROFILE 'on' [method]

To turn on profiling you use the 'on' keyword. This causes any previous profiling data to be discarded, and a new profiling session is started. The optional 'method' parameter specifies which of the following profiling types to use:

Code  Profiling method
1     Measure time in CPU cycles used by APLX application
2     Measure time in CPU cycles used by APL task
3     Measure time used by APLX application
4     Measure time used by APL task
5     Measure elapsed time
0     Use the first method supported by the OS (default)

Depending on which platform you are using, one or more of the timing methods may not be available. For example, earlier versions of Windows cannot measure the number of CPU cycles used by an application. If the method specified is not available a DOMAIN ERROR occurs.

Measuring CPU cycles (method 1 or 2) usually gives the most accurate results, because the CPU count is updated continuously. If this method is not available you can fall back on methods 3 and 4, which make use of a low-level timer provided by the operating system. This may be less accurate: under Windows the timer value is only updated each time a thread reaches the end of its time slice, so that a number of APL lines may execute for each tick of the timer.
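
The fallback described above could be sketched as follows. This assumes that the execute-alternate system function ⎕EA is available in your version of APLX and that index origin (⎕IO) is 1; neither is stated on this page, so treat it as an illustration rather than a recommended idiom. Method 0 already selects the first supported method for you, so a trap like this is only useful when you want a specific order of preference.

```apl
⍝ Prefer CPU cycles for the APL task (method 2). If that signals an error
⍝ (for example DOMAIN ERROR because the method is not supported), start
⍝ time-based profiling of the APL task (method 4) instead.
'⎕PROFILE ''on'' 4' ⎕EA '⎕PROFILE ''on'' 2'
s←⎕PROFILE 'state'
s[2]                   ⍝ the method actually in use (assumes ⎕IO is 1)
```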

In most implementations, APLX uses multiple process threads. There is typically one thread for each APL session in progress, one for each additional APL child task started under program control, and one shared thread to handle user interaction via the GUI. Depending on how your application is structured you might choose the following:

  • For most applications it is best to measure the time taken by the whole APLX application (method 1 or 3). This will provide a more accurate reflection of the cost of executing each line of APL code because it includes any time used by the GUI thread - for example to handle any drawing operations that the line performs.
  • For applications where you start additional tasks under APL control (or if you have multiple APL sessions executing simultaneously), choose method 2 or 4. This avoids wrongly charging the time taken in the other APL tasks to the current profile.
  • Measuring the elapsed time can also return useful information; for example it can help you to find where time is spent by APL waiting for network operations to complete or executing ⎕DL.

Controlling Profiling

⎕PROFILE 'pause'
⎕PROFILE 'resume'
⎕PROFILE 'reset'
r←⎕PROFILE 'state'

There is a small performance penalty when running APL code with profiling turned on, so you may wish to suspend profiling temporarily. You can do this using the 'pause' and 'resume' keywords.

To end profiling completely and discard all profiling data, use the 'reset' keyword. Profiling is also ended by )CLEAR or by loading a new workspace.
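
A typical session using these keywords might look like the sketch below. LOADDATA and PROCESS are hypothetical function names used only for illustration.

```apl
⎕PROFILE 'on'          ⍝ start a new session, discarding any earlier data
⎕PROFILE 'pause'       ⍝ suspend recording
LOADDATA               ⍝ hypothetical set-up function, not recorded
⎕PROFILE 'resume'      ⍝ continue recording
PROCESS                ⍝ hypothetical function whose lines are profiled
⎕PROFILE 'show'        ⍝ display the results in a Performance Profile window
⎕PROFILE 'reset'       ⍝ end profiling and discard the data
```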

To determine the current profiling state use the 'state' keyword. This returns a five-element numeric vector as follows:

  • [1]  State: 0 if profiling off, 1 if on, 2 if paused, 3 if aborted because of e.g. insufficient memory
  • [2]  Method: Profiling method currently being used (see the 'on' keyword)
  • [3]  Tick Period: For time-based profiling methods, this contains the period of the timer tick in nanoseconds (0 if unknown)
  • [4]  Resolution: The approximate resolution of the timer in ticks, or 0 if not known.
  • [5]  Total: The total time covered by the profiling data, in timer ticks
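
As a sketch of how these elements combine, the total profiled time in seconds can be estimated from the tick count and the tick period. This assumes index origin (⎕IO) 1 and a time-based method; the result is meaningless if element [3] is 0 (period unknown).

```apl
s←⎕PROFILE 'state'     ⍝ five-element numeric vector
s[1]                   ⍝ 0 off, 1 on, 2 paused, 3 aborted
secs←s[5]×s[3]÷1E9     ⍝ total ticks × nanoseconds per tick, expressed in seconds
```

For example, a tick period of 100 nanoseconds and a total of 25000000 ticks corresponds to 2.5 seconds.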

The Tick Period and Resolution values may only be approximate, depending on the capabilities of the underlying operating system. For example, calls to measure thread times under Windows use the GetThreadTimes method. This returns results in multiples of 100 nanoseconds (the tick period), but Windows only increments the thread time at the end of each time slice, so the resolution is poor. You should use measurements in CPU cycles for greater accuracy if your version of Windows supports this.

Viewing the profile data

⎕PROFILE 'show'
⎕PROFILE 'save' filename [detail]
r←⎕PROFILE 'data' [functions]

Profiling results can be viewed at any time while profiling is in progress or is paused. If you wish to perform cumulative profiling over several runs you can do so, because time spent in desk calculator mode is not recorded. Previous results are only discarded if you start a new profiling session, clear the workspace or load a new one, or if you explicitly discard them using the 'reset' keyword.

The easiest way to view the results is to use the 'show' keyword, which will cause a new Performance Profile window to be displayed. You can use this to explore the data in a number of ways, for example to find out where most time was spent or which functions were called most often.

To save the results as a file in HTML format, use the 'save' keyword. This takes a character vector containing the name of the file to create, which can be a full pathname or just a file name in the current directory. If you supply an empty vector, a dialog is displayed allowing the user to select a file.

Because the profiling information can be quite large, a second parameter to 'save' allows you to control the level of detail written to the HTML file. The values are:

  • 0 - Write summary information only (default). This includes only the functions and lines which contribute most to the time taken
  • 1 - Write detailed information which includes every function which executed during profiling

To obtain the profiling data as an APL array you can use the 'data' keyword. This returns a multi-row, 8 column nested array of profiling data, ordered by function and line number. The columns are as follows:

  • [;1]  Function name
  • [;2]  Line number within function
  • [;3]  Number of times line was executed
  • [;4]  Total time spent in the line itself
  • [;5]  Total time spent in the line and any functions it calls (its children).
  • [;6]  Average time taken to execute the line, excluding children
  • [;7]  Maximum time taken to execute the line, excluding children
  • [;8]  Minimum time taken to execute the line, excluding children

In the case of recursive functions, the time spent back in a function line is included in the 'self' figure, not in the figure for the line and its children.

You can restrict the data to one or more specified functions by supplying the function name(s), for example: ⎕PROFILE 'data' 'DRAW' 'UPDATE'
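
Because the result is an ordinary APL array, it can be processed with normal primitives. The sketch below lists the lines with the largest self time first. It assumes index origin (⎕IO) 1 and that column 4 contains simple numbers, neither of which is stated explicitly on this page.

```apl
d←⎕PROFILE 'data'            ⍝ one row per function line, 8 columns
d←d[⍒d[;4];]                 ⍝ reorder rows by column 4 (time in the line itself), largest first
d[⍳10⌊1↑⍴d;1 2 3 4]          ⍝ name, line, count and self time for up to ten rows
```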
