Efficiency of OpenOCD for fault injection on Pandaboard ES

Hi there,

I am using OpenOCD as a library for a fault injection project on a Pandaboard ES.

I need to

  • Navigate to a dynamic instruction (a specific instruction in the trace) where the fault injection should take place
  • Modify RAM and registers

    To do this, I have written code that uses OpenOCD as a library, to be as efficient as possible.

    For the navigation, I am using breakpoints (and later also watchpoints) to find the specific point in time at which I will inject a fault.

    A simple scenario is as follows:

    The trace consists of only one repeated instruction A, and I need to get to the 100th execution of A. So I set a breakpoint and increment a counter every time it hits.

    The code looks like this:

    // Find target
    struct target *target;
    target = get_target("omap4460.cpu");
    
    if (!target) {
        LOG_USER("Target not found!\n");
        return 1;
    }
    
    // Reset target (reset() is a copy of the static target_process_reset() from target.c)
    if (reset(cmd_ctx, RESET_RUN)) {
        LOG_USER("Error while resetting target");
        return 1;
    }
    
    if (target_halt(target)) {
        LOG_USER("Target could not be halted\n");
        return 1;
    }
    
    if (target_poll(target)) {
        LOG_USER("Error polling after halt!\n");
        return 1;
    }
    
    // Hardware breakpoint on instruction A (0x83000474 == 2197816436), 4 bytes long
    if (breakpoint_add(target, 0x83000474, 4, BKPT_HARD)) {
        LOG_USER("Breakpoint not set!\n");
        return 1;
    }
    
    int i;
    for (i = 0; i < 100; i++) {
        // args: current=1, address=0 (unused), handle_breakpoints=1, debug_execution=0
        if (target_resume(target, 1, 0, 1, 0)) {
            LOG_USER("Target could not be resumed!\n");
            return 1;
        }
    
        // wait for breakpoint
        while (1) {
            if (target_poll(target)) {
                LOG_USER("Error polling after resume!\n");
                return 1;
            }
            if (target->debug_reason == DBG_REASON_BREAKPOINT) {
                break;
            }
        }
        // step over the instruction at the breakpoint (current=1, address=0, handle_breakpoints=1)
        if (target_step(target, 1, 0, 1)) {
            LOG_USER("Stepping error!\n");
            return 1;
        }
    }
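
The wait loop above spins forever if the breakpoint is never reached. Below is a minimal sketch of a bounded variant. Everything in it (`poll_fn`, `wait_for_halt`, `fake_target`, `fake_poll`) is a hypothetical name for illustration and not part of OpenOCD; in the real program the callback would wrap the `target_poll` call and the `debug_reason` check from the loop above.

```c
#include <stdbool.h>

/* Hypothetical callback: <0 on error, 0 while still running, >0 once halted. */
typedef int (*poll_fn)(void *ctx);

enum wait_result { WAIT_HALTED = 0, WAIT_ERROR = -1, WAIT_TIMEOUT = -2 };

/* Poll at most max_polls times instead of looping forever. */
static enum wait_result wait_for_halt(poll_fn poll, void *ctx, unsigned max_polls)
{
    for (unsigned n = 0; n < max_polls; n++) {
        int r = poll(ctx);
        if (r < 0)
            return WAIT_ERROR;
        if (r > 0)
            return WAIT_HALTED;
    }
    return WAIT_TIMEOUT;
}

/* Stand-in for a real target: reports "halted" on the N-th poll. */
struct fake_target {
    unsigned polls_until_halt;
    unsigned polls_seen;
};

static int fake_poll(void *ctx)
{
    struct fake_target *f = ctx;
    f->polls_seen++;
    return f->polls_seen >= f->polls_until_halt;
}
```

Counting polls keeps the sketch deterministic; a wall-clock deadline would be the more natural bound on real hardware.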
    

    I have two problems with this code.

    First of all, I can’t do it without the target_step, because parameter 4 (handle_breakpoints) of target_resume does not seem to work on the Pandaboard. If I don’t single-step, the target doesn’t continue after the resume but ends up in the same breakpoint again.

    It is also very slow. I measured the time for one stop-and-go cycle, and it is 246 ms.

    After profiling the execution with callgrind, I saw that most of the time is spent in target_step.

    Is there any way to get the resume working on the Pandaboard? Am I doing something fundamentally wrong?

    Thank you in advance

    Lars

    Hi,

    As I dug deeper into the code, I found that the step procedure for the Cortex-A9 (cortex_a.c, cortex_a8_step(…)) works as follows:

    1. Remove the current BP (the one that halted the system)
    2. Add a BP on the following instruction
    3. Resume execution
    4. Re-add the original BP (from step 1)
    5. Remove the helper BP (from step 2)
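
The five steps can be sketched on a small mock to make the cost visible. This is not OpenOCD code: `mock_target` and all `mock_*` functions are invented for illustration, the mock assumes straight-line code with 4-byte instructions, and `debug_ops` simply counts one unit per simulated debug operation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BPS   6     /* arbitrary number of breakpoint slots in the mock */
#define MAX_INSNS 1024  /* safety bound so mock_resume always terminates */

/* Mock target, NOT OpenOCD's struct target. */
struct mock_target {
    uint32_t pc;
    uint32_t bps[MAX_BPS];
    bool     bp_used[MAX_BPS];
    unsigned debug_ops;  /* number of simulated debug operations */
};

static bool mock_bp_at(const struct mock_target *t, uint32_t addr)
{
    for (size_t i = 0; i < MAX_BPS; i++)
        if (t->bp_used[i] && t->bps[i] == addr)
            return true;
    return false;
}

static int mock_bp_add(struct mock_target *t, uint32_t addr)
{
    for (size_t i = 0; i < MAX_BPS; i++) {
        if (!t->bp_used[i]) {
            t->bps[i] = addr;
            t->bp_used[i] = true;
            t->debug_ops++;
            return 0;
        }
    }
    return -1;  /* no free slot */
}

static int mock_bp_remove(struct mock_target *t, uint32_t addr)
{
    for (size_t i = 0; i < MAX_BPS; i++) {
        if (t->bp_used[i] && t->bps[i] == addr) {
            t->bp_used[i] = false;
            t->debug_ops++;
            return 0;
        }
    }
    return -1;  /* no such breakpoint */
}

/* A breakpoint halts before its instruction runs, so resuming while a
 * breakpoint sits on the current pc halts again immediately. */
static void mock_resume(struct mock_target *t)
{
    t->debug_ops++;
    for (unsigned n = 0; n < MAX_INSNS && !mock_bp_at(t, t->pc); n++)
        t->pc += 4;
}

/* The five-step sequence from the list above. */
static int mock_step(struct mock_target *t)
{
    uint32_t here = t->pc, next = t->pc + 4;
    bool had_bp = mock_bp_at(t, here);

    if (had_bp && mock_bp_remove(t, here))  /* 1: remove current BP */
        return -1;
    if (mock_bp_add(t, next))               /* 2: helper BP on next insn */
        return -1;
    mock_resume(t);                         /* 3: resume */
    if (had_bp && mock_bp_add(t, here))     /* 4: re-add original BP */
        return -1;
    return mock_bp_remove(t, next);         /* 5: remove helper BP */
}
```

In this mock, one step from a breakpointed address costs five debug operations where a plain resume costs one, which matches the impression that the procedure is expensive but needed to get past the breakpoint.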


    This looks somewhat inefficient, but I assume it is necessary. If so, I guess I won’t be able to speed up the whole procedure.

    As a cheap hack, I modified my code so that, instead of step + resume, it just resumes at the next address. This is faster, as one stop-and-go cycle now takes only 115 ms, but it is no longer correct, because the instruction at the breakpoint is never executed.

    I expected this to give a much larger speedup than a factor of about 2 (246 ms vs. 115 ms). So, am I overlooking something really slow in the whole system?

    Best regards

    Lars

    Hi,

    It seems that the call to target_poll takes quite a long time.

    Is there a cheaper way to find out whether the system has halted at a breakpoint?

    I tried this code snippet from cortex_a8_poll(struct target *target), but I am getting a segfault:

    uint32_t dscr;
    struct cortex_a8_common *cortex_a8 = target_to_cortex_a8(target);
    struct armv7a_common *armv7a = &cortex_a8->armv7a_common;
    struct adiv5_dap *swjdp = armv7a->arm.dap;
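    int retval = mem_ap_sel_read_atomic_u32(swjdp, armv7a->debug_ap,
            armv7a->debug_base + CPUDBG_DSCR, &dscr);
    if (retval != ERROR_OK) {
        // dscr is not valid if the read failed
        LOG_USER("Could not read DSCR!\n");
        return 1;
    }
    
    if (DSCR_RUN_MODE(dscr) == DSCR_CORE_HALTED) {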
        // handle breakpoint
    }
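
Once a raw DSCR value has been read, it can be decoded without further target access. The sketch below assumes the ARMv7 debug architecture layout (bit 0 = halted, bit 1 = restarted, bits [5:2] = method of entry); the macro and function names are local to this example and are not OpenOCD's, and the encodings should be checked against the ARM Architecture Reference Manual before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed DBGDSCR layout (ARMv7 debug architecture); names are local. */
#define DSCR_HALTED_BIT    (1u << 0)
#define DSCR_RESTARTED_BIT (1u << 1)
#define DSCR_MOE_SHIFT     2
#define DSCR_MOE_MASK      0xFu

/* True if the core reports that it is in debug state. */
static bool dscr_is_halted(uint32_t dscr)
{
    return (dscr & DSCR_HALTED_BIT) != 0;
}

/* Method-of-entry field: why the core entered debug state. */
static unsigned dscr_moe(uint32_t dscr)
{
    return (dscr >> DSCR_MOE_SHIFT) & DSCR_MOE_MASK;
}

/* Human-readable name for the method of entry. */
static const char *dscr_moe_name(uint32_t dscr)
{
    switch (dscr_moe(dscr)) {
    case 0x0: return "halt request";
    case 0x1: return "breakpoint";
    case 0x2: return "asynchronous watchpoint";
    case 0x3: return "BKPT instruction";
    case 0x4: return "external debug request";
    case 0x5: return "vector catch";
    case 0xA: return "synchronous watchpoint";
    default:  return "other/reserved";
    }
}

/* Halted specifically because a hardware breakpoint matched. */
static bool dscr_halted_on_breakpoint(uint32_t dscr)
{
    return dscr_is_halted(dscr) && dscr_moe(dscr) == 0x1;
}
```

Checking the method of entry in addition to the halted bit distinguishes a breakpoint hit from, for example, an explicit halt request.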
    

    Best regards

    Lars