Path: news.uiowa.edu!news.physics.uiowa.edu!math.ohio-state.edu!cs.utexas.edu!news.eas.asu.edu!gatech!newsfeed.internetmci.com!vixen.cso.uiuc.edu!usenet
From: derek@slab.isdn.uiuc.edu (Derek Taubert)
Newsgroups: comp.sys.apple2.programmer
Subject: Re: speed in orca/c programming
Date: 8 Feb 1996 01:05:38 GMT
Organization: University of Illinois at Urbana
Lines: 121
Message-ID: <4fbi92$246@vixen.cso.uiuc.edu>
References: <4fampb$m9k@styx.uwa.edu.au>
Reply-To: taubert@uiuc.edu
NNTP-Posting-Host: slab.isdn.uiuc.edu

In article <4fampb$m9k@styx.uwa.edu.au> jturner@tartarus.uwa.edu.au  
(Joseph Turner) writes:
>    SetSolidPenPat(color);
> SetPenSize(6,1); 
> u= 0;
> for(u = 0; u<1;u+=0.08){
>   x1 = px[1] +(px[2]-px[1])*u;
> 
>  MoveTo(x1,py[1]+u*(py[2]-py[1]));
>  LineTo(x1,py[3]+(py[4]-py[3])*(1-u));
> 
>   }
>   SetPenSize(3,1); 
>   return;
>  }
>   the variables px, py are integers and go from px[0] to px[4]
> U is just an integer
> This in fact is used to 'shade' in a rectangle. Pull back is used. The 
> px[i], py[i] are just the corner points of the rectangle.

Ok, I'll put on my optimizing compiler tricks 101 hat for this one...

You could get a big increase in speed by doing an optimization that I'm  
almost positive ORCA/C doesn't do, which is loop invariant computation  
removal.

x1 = px[1] + (px[2]-px[1])*u;

px[1] and px[2] are constant with regard to the loop induction variable  
(u).  That means that for different iterations of the for loop, their  
value doesn't change.  Take the px[2]-px[1] computation out of the loop,  
and replace it with a local variable.  In fact, you can do this for  
several other statements in the loop as well:

	SetSolidPenPat(color);
	SetPenSize(6,1);
	/* u= 0;	remove this, it is redundant */
	li_1 = px[2]-px[1];
	li_2 = py[2]-py[1];
	li_3 = py[4]-py[3];
	for(u = 0; u<1;u+=0.08) {
		x1 = px[1] +(li_1)*u;

		MoveTo(x1,py[1]+u*(li_2));
		LineTo(x1,py[3]+(li_3)*(1-u));
	}
	SetPenSize(3,1); 
	return;

Now, we can do something called strength reduction.  Strength reduction  
takes a computation involving an induction variable (u) and a loop  
constant (li_1, li_2, li_3) and reduces it from a multiplication operation  
to an addition.  Multiplication on the GS is MUCH more expensive than  
addition, so this step will be a big win.

	SetSolidPenPat(color);
	SetPenSize(6,1);
	li_1 = px[2]-px[1];
	li_2 = py[2]-py[1];
	li_3 = py[4]-py[3];
	u = 0;
	sr_1 = li_1 * u;
	sr_2 = li_2 * u;
	sr_3 = li_3 * (1-u);
	for(; u<1;u+=0.08) {
		x1 = px[1] + sr_1;
		sr_1 += 0.08;
		MoveTo(x1,py[1]+sr_2);
		sr_2 += 0.08;
		LineTo(x1,py[3]+sr_3);
		sr_3 -= 0.08;	/* this one is tricky */
	}
	SetPenSize(3,1); 
	return;

Cleaning this code up a bit gives us:

	SetSolidPenPat(color);
	SetPenSize(6,1);
	u = 0;
	sr_1 = px[1] + px[2]-px[1] * u;
	sr_2 = py[1] + py[2]-py[1] * u;
	sr_3 = py[3] + py[4]-py[3] * (1-u);
	for(; u<1;u+=0.08) {
		MoveTo(sr_1, sr_2);
		sr_2 += 0.08;
		LineTo(sr_1, sr_3);
		sr_3 -= 0.08;	/* this one is tricky */
		sr_1 += 0.08;
	}
	SetPenSize(3,1); 
	return;

Notice that we've eliminated a few of the variables we used before.  In  
fact, we don't even need u anymore for the loop index, we can use sr_1.

	SetSolidPenPat(color);
	SetPenSize(6,1);
	sr_2 = py[1];
	sr_3 = py[3] + py[4]-py[3];
	for(sr_1=px[1]; sr_1<px[2]; sr_1+=0.08) {
		MoveTo(sr_1, sr_2);
		sr_2 += 0.08;
		LineTo(sr_1, sr_3);
		sr_3 -= 0.08;	/* this one is tricky */
	}
	SetPenSize(3,1); 
	return;

This should be a hell of a lot better under ORCA/C than what you started  
with, because we've reduced the type (not a single multiply nor array  
reference) and number of operations (3 additions, one compare) done inside  
the loop.  You could even replace the px[2] in the compare with a local  
variable set up before the loop.

Enjoy!

Derek Taubert                    | Ask me about TCP/IP for the Apple IIgs!
taubert@uiuc.edu                 | Really, I'm serious!
http://slab.isdn.uiuc.edu/~derek | Or check it out yourself at:
FAX:(217) 359-0010          | http://www.winternet.com/~taubert/gstcp.html
Path: news.uiowa.edu!news.physics.uiowa.edu!math.ohio-state.edu!cs.utexas.edu!news.eas.asu.edu!gatech!newsfeed.internetmci.com!vixen.cso.uiuc.edu!usenet
From: derek@slab.isdn.uiuc.edu (Derek Taubert)
Newsgroups: comp.sys.apple2.programmer
Subject: Re: speed in orca/c programming
Date: 8 Feb 1996 01:46:44 GMT
Organization: University of Illinois at Urbana
Lines: 55
Message-ID: <4fbkm4$49f@vixen.cso.uiuc.edu>
References: <4fbi92$246@vixen.cso.uiuc.edu>
Reply-To: taubert@uiuc.edu
NNTP-Posting-Host: slab.isdn.uiuc.edu


Whoops!  I screwed up the strength reduction in the last example I gave.   
It should instead be:

	SetSolidPenPat(color);
	SetPenSize(6,1);
	li_1 = px[2]-px[1];
	li_2 = py[2]-py[1];
	li_3 = py[4]-py[3];
	u = 0;
	sr_1 = li_1 * u;
	sr_2 = li_2 * u;
	sr_3 = li_3 * (1-u);
	for(; u<1;u+=0.08) {
		x1 = px[1] + sr_1;
		sr_1 += (li_1 * 0.08);	/* <- */
		MoveTo(x1,py[1]+sr_2);
		sr_2 += (li_2 * 0.08);	/* <- */
		LineTo(x1,py[3]+sr_3);
		sr_3 -= (li_3 * 0.08);	/* <- this one is tricky */
	}
	SetPenSize(3,1); 
	return;

This changes the 0.08 constants in the final result to local variables  
that can be computed before the loop:

	SetSolidPenPat(color);
	SetPenSize(6,1);
	in_1 = (px[2]-px[1]) * 0.08;
	in_2 = (py[2]-py[1]) * 0.08;
	in_3 = (py[4]-py[3]) * 0.08;
	sr_2 = py[1];
	sr_3 = py[4];	/* <- I forgot to reduce this last time */
	for(sr_1=px[1]; sr_1<px[2]; sr_1+=in_1) {
		MoveTo(sr_1, sr_2);
		sr_2 += in_2;
		LineTo(sr_1, sr_3);
		sr_3 -= in_3;	/* this one is tricky */
	}
	SetPenSize(3,1); 
	return;

This should act the same as what you started with (unless I've screwed up  
something else), but much faster.  :)

ONE LAST THING!  :)  The Line() function is probably faster than LineTo(),  
so you could transform the loop to use that instead.  And then there is  
the issue of using integers instead of reals, but I'll save that for  
another lecture...

Derek Taubert                    | Ask me about TCP/IP for the Apple IIgs!
taubert@uiuc.edu                 | Really, I'm serious!
http://slab.isdn.uiuc.edu/~derek | Or check it out yourself at:
FAX:(217) 359-0010          | http://www.winternet.com/~taubert/gstcp.html