WCET analysis for multiprocessor based on real-time Java processor

(1)

V N U Journal o f S cience, M a th e m a tic s - Physics 27 (2011) 101-110

WCET analysis for multiprocessor based on real-time Java processor

Quang-Dung Vu% Viet-Ha Nguyen

V N U U n iv e r s ity o f E n g in e e r in '^ a n d T echnology^ 1 44 X u a n Thuy, C a u G iay, H a n o i, V ietnam

R c c c i v e d 11 F c b u a r y 2 0 ] 1

A bstract. In this paper, vve propose a solution for a w c r s l- c a s e e x e c u tio n time (WCET) analyzable Java system - a combination of a time predictable Java processor and a method WCET analysis at Java bytecode level. The execution time o f bytecodes, the instructions of the Java virtual machine, is known cycle accurately for Java processor, which simplifies the low-level WCET analysis [1 .

In hard real-time systems, the estimation of the WCET is essential. WCET analysis is in general an undecidable problem. As concerning above, we propose some o f WCET analysis methods using control flow graph applied for high-level and low-level Java processor. Java bytecode generation has to follow stringent rules in order to pass the class file verification of the JVM. Those restrictions lead to an analysis friendly code; e.g. the stack size is known at each instruction. The control flow instructions are well defined. Branches are relative and the destination is within the same method. Detection of basic blocks in Java bytecode and construction o f the control flow graph (CFG) is thus straight forward.

Key-work: Java processor, Java virtual machine, control flow graph, Java bytecode, worst case execution time, best case execution time, Integer linear programming (ILP).

Ỉ. Introduction

W C E T p lay s an im p o rtan t role in verify in g Jav a bytecode in rea l-tim e constraints. W C E T an aly sis is a w e ll e sta b lish e d research area. H ow ever, th ere is still a gap b etw een the theoretical fin d in g s and th e p rac tic a l u sage o f W C E T an alysis tools. W C E T a n a ly sis is u su ally d iv id ed into h igh-level and lo w -le v e l te c h n iq u es. H igh-level W C E T analysis co n sid ers th e program stru c tu re by path analysis on th e co n tro l flow graph (C FG ). T h e low -level part is co n c ern e d w ith the ex ecu tio n tim e o f m achine in stru ctio n s or instru ctio n sequences. Tli£ m ain issue for W C E T analysis m eth o d is th e g row ing c o m p le x ity o f n ew processors. It is alm o st im p o ssib le to m odel them for the low -level an alysis. We a re u sin g Ja v a processor, called JO P [2], d esigned for tim e-p red ic ta b le e x ecu tio n o f real-tim e tasks in stru c tio n cache.

T h e p a p e r is org an ized as follow s. R elated w ork is review ed in S ection 2. Section 3 presen ts o u r p ro posed re a l-tim e Jav a W C E T analysis. A n experim ental case is p rese n ted in Section 4 to g eth er w ith resu lts and d isc u ssio n .

* C orresponding author. Tel.: +84915087345 E-m ail: dungvq@ vnu.edu.vn

101

(2)

1 0 2 ^ (_) [)nníỵ. N V. H a / V N i ỉ J o u r n a l o f S c i e n c e . M a t h e m a t i c s - P h y s i c s 2 7 Í 2 0 I I ) 1 0 1 - N O

2. Related w o r k

In th is se c tio n , w e review several existing w orks th a t are rela te d to o u r p ro posed m ethod on W C E T a n a ly sis in real-tim e Java system s. Sun in troduced th e first v e rsio n o f p ico Ja v a [3] in 1997, H ow ever, th is p ro ce sso r w as never released as a p ro d u ct b y Sun. A red e sig n follow ed in 1999, know n as p ico Ja v a -II, w hich w as freely available. T h e a rc h ite c tu re o f p ic o Ja v a is a sta c k -b ase d C ISC p rocesso r im p le m e n tin g 341 d ifferent instructions. It is th e m ost c o m p le x J a v a p ro ce sso r available.

T h e p ro cesso r can be im p lem en ted in 2 7 ,6K logic c ells in an FPG A as sh o w n in [4 .

Shaw p re se n ts in [5] tim in g schem as to calc u la te m in im u m an d m a x im u m e x ecu tio n tim e for com m on lan g u ag e co n stru cts. T h e rules allow to c o lla p se th e a b stra c t s y n ta x tre e o f a program until a final single v alu e rep resen ts th e W C ET. H ow ever, w ith th is ap p ro a c h it is not stra ig h t forw ard to incorporate g lo b al low -level a ttrib u tes, such as pip elin es or caches. T h e re s u ltin g b o u n d s are not tight enough to be p ra c tic a lly useful.

C o m p u tin g th e W C E T w ith an integer lin e a r p ro g ra m m in g (IL P ) s o lv e r is proposed in [G a n d [7]. We b a se o u r W C E T analy zer on the ideas from th ese tw o gro u p s. T h e W C E T is c a lcu lated by iran sfo rm in g th e c a lc u la tio n to an integer linear pro g ram m in g p roblem . E ach b a sic b lo ck is represen ted by an edge in th e T -g ra p h (tim in g graph) w ith the w eig h t o f th e e x e c u tio n tim e o f th e basic block.

V ertices in th e graph rep resent the split and jo in p o in ts in th e co n tro l flow . F u rth erm o re, each edge is also assig n e d an ex ecu tio n frequency. T he co n stra in ts resu ltin g from th e T -g ra p h and ad d itio n al functional c o n stra in ts (e.g., loop b o unds) are solved by an ILP solver. T h e T -g ra p h is sim ila r to a C F G , w here th e ex ecu tio n tim e is m odelled in the vertices. T h e m o tiv a tio n to m odel the ex ecu tio n tim e in the ed g es resu lts from the observation that m ost b a sic blo ck s en d w ith a c o n d itio n a l branch.

3. Real-time Java W C E T analysis

In th is se ctio n , w e d e scrib e the m ethods o f W C E T an aly sis. T h e re are 2 m eth o d s of W C E l an aly sis - o n e is the basic m ethod in W C E T an alysis- im p lic it path e n u m e ra tio n te c h n iq u e (IP R T ), w h ich bases for hig h -lev el; the o th er is low -level W C E r an a ly sis, w h ic h in v o lv es to Java bytecode.

3 . 1. I P E T W C E T an a lysis m e th o d

IPE T m eth o d [6] is based for W C E T analysis, w h ich can find th e e x a ct W C E T in th ese sim p le cases b ecau se th e ex ecu tio n tim e o f each section o f co d e can be c o n s id e re d in d e p e n d e n tly o f all the others. T h a t is a h ig h -lev el W C E T analysis. In IPET, llie program is m o d e le d as a flo w n etw o rk , and in teg er lin ear p ro g ra m m in g (IL P ) is used to d eterm in e th e ex ecu tio n p a th w ith th e m axim al ex ecu tio n tim e. T h e c a lc u la tio n o f th e W C E T is transform ed to an ILP p ro b le m . In the C F G , each vertex rep resen ts a b a sic b lo ck w ith execution tim e Cị. W ith th e b a sic b lo c k e x e c u tio n fre q u e n c y Cj th e W C E T is:

N W C E T = m a x ^ Cj Cj

i=i

T h e sum is th e o b jec tiv e fu nction for the ILP problem . T h e m ax im u m v a lu e o f th is ex p ressio n resu lts in th e W C E T o f th e program . F urtherm ore, each edge is also assig n e d an e x e c u tio n fre q u e n c y f. T h e se ex ecu tio n fre q u e n c ie s rep resen t the control flow th rough the W C E T p a th . T w o p rim a ry c o n stra in ts form the IL P problem : (i) F or each vertex, the sum o f f j for the in c o m in g e d g e s has to be eq ual to

(3)

the sum o f th e f k o f th e o u tg o in g edges; (ii) th e frequency o f the back edges c o n n e ctin g th e loop b o d y w ith the loop head er, is less th an or equal to th e freq u en cy o f the edges en te rin g th e loop m u ltip lie d by the loop bound.

From th e C F G , w h ic h rep re sen ts th e program stm c tu re , w e can extract th e flo w c o n stra in ts.W ith th e execution fre q u e n c y f o f th e edges and th e tw o sets h for th e incom ing edges to b a s ic block D^

and for th e o u tg o in g e d g e s, the ex ecu tio n freq u en cy ei o f B ị is:

V.Q. D u n g N .v. H a / VNU J o u rn a l o f Science, M a th em a tics - P h ysics 2 7 (2011) l O I - l I O 10 3

=< = E / . = E a

j e h k e o .

T h e frequencies f i are th e in te g e r variab les c a lcu lated by so lv in g th e ILP problem . A d d itio n a l c o n straints are n e e d ed to h a n d le loops. T h e in co m in g edges to th e en try p o in t o f th e loop, th e loop h e a d e r h, are classified as fo llo w s:

1. T h e set o f in c o m in g ed ges E k th a t e n te r the loop 2. T h e set o f b a c k e d g e s Cii th at close th e loop

W ith th e m a x im u m lo o p count (the loop bou n d ) n, w e fo rm ulate th e loop c o n s tra in t as

jeCh. k^Efi

W ithout fu rth e r g lo b al c o n s tra in ts , th e p ro b le m can be solved locally for each m ethod. W e start a t th e leaves o f th e call tre e a n d c a lc u la te th e W C E T for each m ethod. T h e W C E T v a lu e o f a m ethod is included in th e in v o k e in stru c tio n o f th e c a lle r m ethod. To in co rp o rate global c o n stra in ts, su ch as ca ch e co n strain ts [8], a s in g le IL P p ro b lem is b u ilt th a t m odels th e w h o le program . T h e in v o k e in stru ctio n i is connected to th e e n try a n d exit n ode o f th e invoked m ethod, a d d in g ed ges c and r, respectively.

To ensure th a t o n ly v a lid p a th s are c o n sid e red , one ad d itio n al c o n stra in t is n e e d ed fo r each m eth o d invocation:

f i — f c = f r

j e h

T h e exam ple o f IP E T W C E T an aly sis m ethod could be sh o w n in F ig u re ? ? . T h e b a sic b lo ck th a t co ntains th e in v o k e in stru c tio n is sp lit into th re e new blocks: T h e p rec e d in g in stru c tio n s, th e in v o k e instru ctio n , and fo llo w in g in stru ctio n s. C o n sid e r fo llow ing b asic block:

- iload 1 - iload 2 - aload 0

- in v o k ev irtu al fo o - istore 3

W hen differen t v e rs io n s o f fooQ are p o ssib le receiver m ethods, w e m odel th e in v o ca tio n o f fooQ as altern ativ es in th e g ra p h . F o llo w in g th e sta n d a rd ru les for th e in co m in g and o u tg o in g ed ges th e resu ltin g ILP c o n s tra in t fo r th is ex am p le is:

f l = Ỉ2⁺ /3

(4)

104 i (J D u n ợ . N . v . H a / l \ \ ' U J o u r n a l n f S c i ^ ’ncc. M ã í h c m a ĩ i c s - P h v s i c s 2 7 ( 2 0 Ị I ) I O I ~ Ị Ị ( ì

fl

Fig. 1. E xam ple on basic block for p o ssib ility o f IP E T W C E T a n a ly sis m ethod.

Ỉ.2. L o w -le v e l W C E T anơ ìysis m e ih o d

T h e lo w -level an alysis co ncerns to executing Java bytecode in sid e J a v a processor. T h e Figure show s ihe p ro ce ss for execution bytecode on the real JO P system s [2 .

Jav a pc

Ja v a bytecode

Jump table

JL>P microcode

Ỉ 1 - -skA ifC-nul

i?idd r.x-.

X . . _ iload_2

i d i V _ _ ỵ " ^

« i c i V

á l c i V ^-► J O P p c i CVlL . c-b r.xi

&ỄCÍV Ể«c<ii ^V

i c i v : GCr. b s t r . a Idr. r rjr.t Java inỉ^nvcUon

(e.g. 0x6c)

Ũtođoóớres^i ồ tĩd ĩv m JVM RO M

i r-im: ctr. b

■ ' ■ '

Fig. 2. Data flow for Java bytecode inside JOP.

For th e low -level W C E T an aly sis, a good m odel o f th e target a rc h ite c tu re is n e e d ed . In our case th e target a rc h ite c tu re is sim p le w ith respect to the W C E T and w ell d o c u m e n te d . F o llo w in g [9], the W C E T a n a ly s is is perform ed in the m icrocode that im plem ents th e b y te c o d e in stru c tio n s. T h at m eans the b y teco d e in stru c tio n tim in g is d erived by static analysis and no fu rth e r m e a s u re m e n ts are necessary.

M o st b y teco d e in stru ctio n s that do not access m em ory have a c o n s ta n t e x e c u tio n tim e. T hey are ex ecu ted by e ith e r one m icrocode in struction or a short seq u en c e o f m ic ro c o d e in stru ctio n s. T h e execution tim e in clock cycles equals the nu m b er o f m icro in stru ctio n s e x e c u te d . A s th e sta c k is onchip, it can be a c c e sse d in a sin g le cycle. We do not need to incorporate the m a in m em o ry tim in g into th e in struction tim in g o f sim p le bytecodes. T able show s ex am p le in stru c tio n s, th e ir tim in g , and th eir

(5)

KQ. D ung, N.v. H a / Vi^V Jou rn al o f Science, M ath em atics ' P h ysics 2 7 ( 20Ỉ Ỉ ) Ì OỈ - Ỉ Ỉ O 105

m eaning (T O S is to p -o f-sta c k ). A ccess to object, array, and class fields d ep en d on th e tim in g o f the m ain mcmor>'.

T a b e l 1. E xecution tim e o f sim ple bytecodes in cycles Instru ctio n C ycle Function

ico n st 0 1 load co n stan t 0 on TO S b ip u sh load a byte co nstant on T O S iload 0 1 load local variable 0 on T O S iload 2 load a local v ariable on T O S

d u p 1 d u p licate T O S

iad d 1 integer addition

isu b 1 integer subtraction

ifeq 4 conditional branch

O b je c t o rie n te d in stru c tio n s, array access, and invoke in stru ctio n s access th e m ain memory.

T here are 2 ty p e s o f m em o ry that contain th e ex ecu tin g bytecode in side Jav a p ro cesso r - m em ory read and m em ory w rite . T h e fo llo w in g Figure show s the picture o f c ach in g m em ory in Jav a processor.

Fig. 3. Memory stack caching.

A m eth o d c a c h e , w ith cach e fills only on invoke and return, does n o t interfere w ith d a ta access to the m ain m em ory. D a ta in th e m ain m em ory is accessed w ith g etfield and p u tfie ld , in stru ctio n s th at n e v e r o v e rla p w ith in v o k e and return. In traditional caches, d a ta access and in stru c tio n cache fill req u e sts c a n c o m p e te fo r th e m ain m em ory bus. For exam ple, a load or store a t th e e n d o f the p ro ce sso r p ip e lin e c o m p e te s w ith an in struction fetch th at resu lts in a ca ch e m iss. O n e o f th e tw o in stru ctio n s is s ta lle d fo r a d d itio n a l cycles by th e o ther in struction. W ith a d a ta c ach e, th is situation can be even w o rse . T h e w o rst-c a se scenario for th e m em ory stall tim e for an in stru c tio n fetch o r a d ata load is tw o m iss p e n a ltie s w hen both cach e reads are a m iss. T h is u n p re d ic ta b le b e h a v io r leads to very p e s s im is tic W C E T bo u n d s. A ccess tim e that exceeds a sin g le cy cle in clu d es ad d itio n a l w ait states { r ^ s ^ m e m o ry rea d and W-U)S for a m em ory w rite). W ith a m em ory w ith r ^ s w a it sta te s, the e x ecu tio n tim e for, e.g., g e tfie ld is:

i g e t f i e l d — 11 “t" 2t*u;5

(6)

106 K Q. Dung. N. V. H a / VNU Jou rn al o f Science, M ath em atics - P h ysics 2 7 ( 2 0 1 1) I OỈ - Ỉ Ỉ 0

T h e m eaning o f fo rm u la abo v e could be explained as fo llo w in g ,, th e tim e o f g e tfie ld com m and is taken about tw o tim es o f m em o ry read and about 11s on idlè fo r sy ste m d a ta transfer. A m em ory read in JO P m icrocode is sp lit into tw o phases: (1) start th e read tra n s a c tio n an d (2) read th e result.

To avoid tim in g d e p e n d en c ies w ith in th e m em ory su b sy stem o v er b y te c o d e b o u n d a rie s , m em ory store instructions are also sp lit into a sta rt w rite and w a it for co m p le tio n in stru c tio n . B etw een those tw o m icrocode in stru ctio n s th e m em ory subsystem perform s th e m em o ry tra n s a c tio n in p a ra lle l to the the core p ipeline ex ecu tin g m icro co d e instructions. F illin g th is slo t w ith u se fu l m ic ro c o d e in stru ctio n s can h ide some o f th e access latency. T h e m icrocode seq u en ce in th is lo a d /sto re slo t is straig h t line code and th e n u m b er o f h id d en cycles is constant. T h e follow ing e x a m p le gives th e e x a c t ex e cu tio n tim e o f bytecode ld c 2 w in clo c k cycles (the clo ck o f processor is a b o u t lOOM H z a n d S ta c k R A M speed for cycle - 15ns):

tidc2u, = 17 + m a x { r ^ s - 2, 0) + m a x ( w y j s - 1, 0)

M em ory a c c e ss tim e also d e te rm in e s the cache load tim e on a m iss. T h e c a c h e load tim e is calculated as fo llo v s - th e w a it sta te f w s for a single w ord cache fill is:

f w s ~ Ĩ TKl xị Tmsi 1)

The rea l system in te rac tio n b etw een Java ap p lication, sc h e d u le , J V M a n d h a rd w a re is show n on th e fo llow ing F ig u re .

Task 1 T0 B k 2 Scheduler J V M H a r d ’A'ar©

trr.et

5c*>«Julin(j daósion

iS-FNP

schedule

dispatch

C ontoK t

genlnt^

sef iinởĩĩưpỉ

3ch»dulino

decision

schedule?

dispatch

inĩtơưpt

T set rimer ¹

Cont«xi

SV¥ixh

I I Applicotion 1 ^ User defined Q Fran>eii.'ork

Fig, 4. Interact between Java application with schedule and hardware.

(7)

4. E x p e r im e n t

In th is sectio n w e c o n c lu d e w ith a w orst and best case an alysis o f a classic exam ple, the B ubble Sort algorithm . T h e Ja v a so u rce co d e is sh o w in g in below :

pu b lic class B u b b le

final static int N = 5; sta tic void sort(int[] a) int i, j , v l , v2;

// loop c o u n t = N-1 for ( i= N - l; i > 0; - i ) // loop c o u n t = (N - l) * N /2 for 0 = 1 ; ji=>; + + j)

v l = a [ j - l] ; v2 = a[j];

if (v l > v2) a u i = v l ; aU-1] = v 2 ;

} } } }

T h e a lg o rith m c o n ta in s tw o n ested loops and one co n dition. W e use an a ư a y o f five elem ents to perform th e m e a su re m e n ts for all p erm u tatio n s (i.e. 5! = 120) o f th e in p u t data. T h e nu m b er o f iterations o f th e o u te r loop is o n e less th an th e array size ^{C l} ^ N - 1, in th is case is four. T h e inner loop is ex ecu ted C2 = T , t u i = C i(c i + l ) / 2 tim es, i.e. ten tim es. T h e d isa sse m b le r o f bytecode is sh o w in g in th e F ig u re . T h e b y tec o d e is d ivided into som e o f code b lo ck s as sh o w n in th e Figure .

T h e a n n o ta te d c o n tro l flo w grap h (C FG ) o f the exam ple, w h ich is sh o w n in Figure , is a result a fter analysing. T h e e d g e s c o n ta in labels sh o w in g how often th e path b etw een tw o nodes is taken.

We can id en tify th e o u te r loop, c o n ta in in g th e blocks B 2, B 3, B 4 an d B 8. T h e in ner loop consists o f blo ck s B4 B5 B 6 a n d B 7. B lo ck B 6 is ex ecuted w h en th e co n d itio n o f th e i f statem en t is true. T h e path from B5 to B7 is th e o n ly p ath th at d ep en d s on th e input data. W ith u sin g a fo rm u la in section 3.2 w e can also c a lc u la te th e ex e cu tio n tim e on each edge.

T h e W C E T a n d B C E T v a lu e for each b lo ck is calcu lated by m u ltip ly in g th e clo ck cycles by the ex ecu tio n frequency. T h e overall W C E T and B C E T values are c a lcu lated by su m m in g th e values o f th e individual b lo c k s B1 to B 8. T h e last b lo ck (B 9) is o m itted , as th e m easu rem en t does not contain th e retu rn sta te m e n t. T h e ex e cu tio n tim e o f th e program is m easu red u sin g th e cycle c o u n te r m Java processor. T h e c u rre n t tim e is tak e n at both th e entry o f th e m ethod an d at th e end, resu ltin g in a m ea su rem e n t s p a n n in g from b lo c k B1 to th e b eg in n in g o f b lo ck B9. T h e last sta te m e n t, the return, is n o t p a rt o f th e m ea su rem e n t. T h e d ifference b etw een th ese tw o v alu es (less th e ad d itio n al 8 cycles in tro d u c ed by th e m e a su re m e n t itse lf) is given as th e ex ecu tio n tim e in clo ck cycles. M ost bytecodes h av e a sin g le e x e c u tio n tim e (W C E T = B C E T ), and th e W C E T o f a ta sk d ep en d s only on th e control flow . N o p ip e lin e o r d a ta d e p e n d e n c ie s com p licate th e low -level p a rt o f th e W C E T analysis. T h e re su lt is sh o w n in T a b le . T h e ex e cu tio n freq u en cy v alu e is d e p e n d ed on in p u t data. T h e result IS

V.Q. D ung, N .v. H a / VN U J o u rn a l o f Science, M ath em atics - P h ysics 2 7 (2 0 1 1) 1 0 1-110 107

(8)

108 í Ọ. D u ‘>'r. NA', l i a / V N i U o n r n a ỉ o f Science, M a t h e n u i i i i s - Physics 27 { 2 0 Ị h Ị ( ) I ~ Ị ỊD

Fig. 5. Java bubble disassembler.

bi

0: konst 4 1: ỉítOf« 1

B2

2: ikMd 1 3: ifit l i

8: ỉtoMÌ.2

9: ttoíHỉIl

10: 47

BS 13: tlosd 0 14; 101(1.2 IS: k xm slj.

16; isMb 17: iaioad H i Irt0ft,3 19: «kMd 0

20: tfoadlz 21: i«kMd

22: i«tor« « 24; io id i 25; UMd A 27: tf icmple 41

B6 30: aio9d_0 31: ik »d.2 32: ik>ad.3

ĨÌĨ ia»toct Ui aÌMd.o

3S: IkM d^

36: kortJtla 37: tsub Ì9: ilo»d 4

40;

B7 41: Bnc 2.1 44; goto 8

BS 47t iinc 1, -1

50; goto 2

Fig. 6. Java bubble disassembler block.

based on A lte ra board w ith 4 8 M H z C yclone processor, and 6 4 K B m em o ry c a c h e b y in stallin g Java processor.

(9)

K y . D u n g N .v. Ha / VNU Jou rn al o f Science, M a th em a tics - P h ysics 2 7 (2011) l OI - I I O 1 0 9

T able 2. T h e result o f ex perim ent N o d e E x ecu tio n tim e (c) W C E T = c*e (m s)

B 1B 2 1 B l= 2

B 2B 3 4 B 2=25

B 3 B 4 4 B3=8

B 4B 5 10 B 4= 80

B 5 B 6 7 B 5= 740 (D e p e n d in g on loop)

B 6B 7 6 B 6= 730 (D e p e n d in g on loop)

B 7 B 4 C2 10 B 7= 150

B 5B 7 5 D ep en d in g on loop

B 4B 8 4

B 8 B 2 C l 4 B 8= 60

B 2 B 9 1 return

5. Conclusion and Future w o r k

In th is paper, w e have p ro p o sed m ethods for W C E T analysis in Ja v a processor. Som e o f th em is still in d e v e lo p in g , a n d w e w ill com e to an o p tim izin g W C E T m ethod th at ap p lied for Java real-tim e e m b e d d e d sy stem s. In th e fu tu re , w e focus on m ore co m p lex Jav a b y teco d e, th at can b e im p lem en ted in Jav a processor. W e ’v e b e e n d e v e lo p in g and w ill co m plete an W C E T a n a ly sis tool, th a t su p p o rts for h ig h t-lev el as w ell as lo w -lev el W C E T analysis.

(10)

A c k n o w le d g e m e n t. T h is w ork Is partly supported by th e research p ro je c t N o . Q C .0 8 .0 5 granted by V ietn am N ational U n iversity, H anoi.

References

[1] Vu Q uang D ung, N.V.H., Real tim e garbage coileclion for Java m icroprocessor, A T C 2 0 0 8 co n fe re n c e (2008)

[2] M. Schoebcrl, Jop: A Java optim ized processor for em bedded real-tim e system s, in: P h D thesis, Vienna U niversity o f Technology (2005).

[3] J.M . O ’Connor. M. Tremblay, Picojava-i: The Java virtual m achine in hardw are, ỉn: IE E E M icro, N o 17 in 2 (1997) 45.

[4] w . Puffitsch, Picojava-ii in an fpga, ỉn: M a ster's thesis, Vienna U niversity o f Technoiogv (2 0 0 7 ).

[5] A .c . Shaw, R easoning about tim e in higher-level language softw are, ỉn: IE E E Trans. Softw. E ng.. N o 15 (7) (1989) 875.

[6] P. Puschner, A, SchedI, C om puting m axim um task execution tim es o f a graph-based ap p ro a c h . J o u r n a l o f Real-Time System s, N o 13 (July 1997) 67.

[7] Y.T.S. Li, S. M alik, P erform ance analysis o f em bedded softw are using im plicit path en u m eratio n , Ịn L C TE S '95:

P roceedings o f the A C M S ỈG P L A N Ỉ9 9 5 w orkshop on languages, com pilers, a n d tools f o r rea l-tim e system s, (1995) 8 8^.

[8] Y.-T. S. Li, S.M., A. Wolfe, Efficient m icroarchilccture m odeling and path analysis for rcal-tim c softw are, In R T S S '95: P roceedings o f (he !6 th IE E E Real-Tim e Sysiem s Sym posium (R T SS '95), page 298, W ashington, DC, USA, IEEE C om puter Society (1995)

[9] M. Schocberl, A lim e predictable Java processor, In P roceedings o f the D esign, A u to m a tio n a n d Test in Europe C on

fe r e n c e (DATE 2006), (2006) 800.

^ H a / V N U J o u r n a i o f S c i e n c e . M a t h e m a t i c s - P h v s i c s 2 " ( 2 0 Ì I ) Ị O Ị - Ỉ K)