Ken -
This is the list of architectural investigations people are doing
and the status (as of the last update; I have not yet updated it today)
that I use to report to Pat and Richard what is going on.

Since you have the architecture document, these are the "area" investigations
that are going on (when firedrills over CLDs are not being handled instead).

Group members are taking these topics very broadly, on the whole, which
is just what is needed; the issues that cross topics HAVE to be
covered (so that flow control, for instance, interacts with reset
handling properly). What I have to synchronize here is that the important
results get passed to the folks whose areas are most impacted by them
and that the final document describes a design that can work and that
can be incrementally implemented.

I'm working on a "phase 2" document which will incorporate the added
detail and harmonize it, and leave space for major component routines
to be added.

Mind, it is possible to take even what is in existence so far and treat
it as a yardstick against which an implementation is to be measured,
but this will leave many issues unsettled. That is the rationale for
going to the major routine interface specification level.

The goals are primarily to make the SCSI code base more maintainable,
with secondary goals of improving performance (some adapters, e.g.
PKJ's adapter, suffer a 25% drop in throughput with the queue manager
as implemented in Zeta) and making it easier to add new capability.
A further goal is to have the new design ready in time to get some of the first
pieces into Zeta (which implies interoperation with much of the existing
code base).

Non-goals include implementing CAM, putting the complete design into
the architecture, or spelling out in the architecture how every bit of
SCSI is to be applied.

The resulting document is intended to be a benchmark and rationale for high
level decisions on how SCSI is to be implemented which will allow a design
to be gauged for compliance and which will suggest LOP type investigations
about what's involved in moving to it in different parts of the subsystem.
I have been concerned since day 1 (well, maybe day 2 or 3, but before I
had read enough about SCSI to do much) about how the migration is to take
place. The rest of the group has not been required to concern itself with
the constraints so much, because we wanted to get their thinking in blue
sky mode about top level organizational and functional issues.

The fact that some of this represents "blue sky" thinking (and had to at
its earliest stages) may have been why LOP was not followed. I have been
trying to keep people informed, though the recent scheduling issues
have been a change, and I tend to want to deal with the immediate problems
as well as the longer term ones, though the technical direction we have
all been given to date has been to defer new functionality for the time
being. Considering how easy it is to have all your time sucked up by
firedrills, though, I'm beginning to see why the direction has come down
so hard...

Glenn Everhart


Folks -

The following assumes some familiarity with the SCSI architecture
document. These are topics which need to be further investigated to
fill in details in the architecture. This is a design activity and
will require that you think about issues of performance, maintainability,
usefulness to customers, etc.

The topics here may in some cases need only a couple of days, or in other
cases maybe a couple of weeks, to investigate. Please think about the topics
and how long you expect each to take. The investigations need to
produce more detailed text to go into the architecture at the next level
down in details. My hope is that most of these will be relatively short
since there is plenty of additional detail to go beyond this.

We will parcel these out at or about the Thursday meeting; the idea of
passing the list around is that people might have favorite topics or want
clarification or further definition. The time before then will give you some
chance to seek such.

More details of the interfaces at various levels of SCSI and the interfaces
for common routines will need to be worked out, but these questions need
to come first.

The times needed will be negotiated after selections.

glenn
-----------------------------
SCSI architectural issues needing investigation reports


1. Support of SDTR/WDTR and LUNs. SDTR/WDTR are ID wide, but one gets
inquiry data per LUN. How should SDTR be treated (pref. for smart adapters)
in terms of the various enabled bits? Ditto WDTR. Force all to be the same?
Switch on reselect? Disallow certain configurations? What? The question is
what operations should be performed, when, to handle these negotiations
correctly and in a fashion to support most devices. (There are comments in
existing code about SDTR and these issues which will help.)
   My guess: week
Sue. 1wk
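
One of the options named above ("force all to be the same") can be sketched as follows. This is only an illustration under stated assumptions, not a proposal from the investigation: the structure `target_caps` and the routine `common_transfer_caps` are hypothetical names, and the sketch assumes the Sync and WBus16 flags are taken from byte 7 of each LUN's standard INQUIRY data as laid out in SCSI-2. It offers SDTR/WDTR for a given ID only when every LUN that answered claims the feature.

```c
#include <stdint.h>
#include <stddef.h>

/* Standard INQUIRY data, byte 7 flag bits (SCSI-2 layout). */
#define INQ_SYNC    0x10  /* LUN claims synchronous transfer support */
#define INQ_WBUS16  0x20  /* LUN claims 16-bit wide transfer support */

#define MAX_LUNS 8        /* SCSI-2 limit; SCSI-3 may raise it */

/* Hypothetical per-ID record: negotiation is per SCSI ID, but the
 * capability flags arrive per LUN, so both are kept together here. */
struct target_caps {
    uint8_t lun_present[MAX_LUNS];    /* nonzero if the LUN answered INQUIRY */
    uint8_t lun_inq_byte7[MAX_LUNS];  /* byte 7 of that LUN's INQUIRY data */
};

/* Return the transfer features that every present LUN claims.
 * Returns 0 (stay asynchronous and narrow) if no LUN is present. */
uint8_t common_transfer_caps(const struct target_caps *t)
{
    uint8_t caps = INQ_SYNC | INQ_WBUS16;
    int any = 0;

    for (size_t lun = 0; lun < MAX_LUNS; lun++) {
        if (!t->lun_present[lun])
            continue;
        caps &= t->lun_inq_byte7[lun];  /* drop anything this LUN lacks */
        any = 1;
    }
    return any ? caps : 0;
}
```

The other options (switching on reselect, disallowing mixed configurations) would need per-LUN negotiation state instead of a single per-ID answer, which is part of what the investigation has to weigh.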


2. The architecture document proposes a super-SCDRP containing command
buffers as well as state information and possibly custom packing calls to
port drivers to handle unusual packing (i.e., not just copy SCSI command
into buffer) as well as some means of telling where the CMD buffer should
be located. Are there any hidden issues with doing this that would act
as problems? Any reasons such a proposal might negatively impact function
or performance?
   My guess: 3 wks
Sue, Glenn. 3 wks

3. How should SCSI data structures be linked together? (Remember SCSI3
is likely to mean larger IDs, LUN numbers, and maybe wider constants.) Moving
from one data structure to another is frequent and we need to be sure
a scheme in the architecture can handle growth and be efficient.
   My guess: week
Jim. by 9/22

4. Is a single level selector sufficient for matching device "SCSI IQ"
or peculiarities within class level? Or are there examples where additional
capabilities lists should be maintained per device? Suggest forms for these
to take if needed. Can one create a single number (or a single number per VMS
function) as suggested in the architecture to be a valid representation of
a SCSI IQ, or must more dimensions be used? If so, what? (Can a SCSI device
be an idiot savant?)
   My guess: 2 weeks
Rick. Start in 1wk on all 3

5. What parts of flow control can be reasonably handled at class startio
and what needs to be done at port level? Would it be more advantageous to
just have flow control all handled at port level? The object is to avoid
resource exhaustion and strive for some I/O fairness. This involves questions
of whether the class busy bits are adequate, how queue depth and queue
full status should interact, and at what level, and whether one should (try to) use
mode pages to tell how full a TCQ queue is and adapt to the hardware. How to
handle the switch to single command vs. TCQ mode is an issue too. (Should
one issue bus device reset or some such to stop long operations?) Also an
issue: multi-initiator busses. If one can setmode-control quotas one might
adapt total quotas so the queues would retain room even though >1 initiator
is using the bus.  (Since the queue manager is to be part of only those port
drivers for "dumb" ports, the current flow control scheme resident in it
needs to be revisited.)
   My guess: 2-3 wks
Jim. 1+ wks
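
To make the queue depth vs. queue full question concrete, here is a minimal sketch of one possible interaction, assuming the limit is kept per device and adjusted from completion status alone. All names (`dev_flow`, `flow_can_issue`, `flow_issued`, `flow_completed`) and the `PROBE_AFTER` threshold are made up for illustration; only the status byte values are from SCSI-2. It does not address where the code lives (class vs. port), multi-initiator quotas, or mode page queries.

```c
#include <stdint.h>

#define SCSI_STATUS_GOOD        0x00
#define SCSI_STATUS_QUEUE_FULL  0x28

#define PROBE_AFTER 16  /* arbitrary: GOOD completions before raising the limit */

/* Hypothetical per-device flow-control state. */
struct dev_flow {
    unsigned outstanding;  /* commands currently sent and not yet completed */
    unsigned depth_limit;  /* how many we currently allow outstanding */
    unsigned depth_max;    /* configured ceiling for depth_limit */
    unsigned good_streak;  /* GOOD completions since the last adjustment */
};

/* May another command be sent to this device now? */
int flow_can_issue(const struct dev_flow *f)
{
    return f->outstanding < f->depth_limit;
}

/* Record that a command was sent. */
void flow_issued(struct dev_flow *f)
{
    f->outstanding++;
}

/* Record a completion and adapt the limit to what the device accepts. */
void flow_completed(struct dev_flow *f, uint8_t status)
{
    if (f->outstanding > 0)
        f->outstanding--;

    if (status == SCSI_STATUS_QUEUE_FULL) {
        /* The rejected command never entered the device queue, so what
         * remains outstanding is what the device could actually hold. */
        f->depth_limit = f->outstanding > 0 ? f->outstanding : 1;
        f->good_streak = 0;
    } else if (status == SCSI_STATUS_GOOD) {
        /* After a run of successes, cautiously probe one deeper. */
        if (++f->good_streak >= PROBE_AFTER && f->depth_limit < f->depth_max) {
            f->depth_limit++;
            f->good_streak = 0;
        }
    }
}
```

A command that came back with QUEUE FULL would still have to be requeued by the caller; the sketch only tracks the counts.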

6. What do SCSI control chips supply by way of bus quality metrics? Is there
any common information that can be captured and made available to users in
some fashion about this, or are the control chips so different that
basically no common information is conceivable? Bus quality metrics are
desirable for diagnosis of field problems, possibly for field tuning of
parts of SCSI, and for determining when path failover might be needed...IF
it is feasible to obtain any such thing in a reasonable way.
   My guess: 2 wks
Rick. Start in 1wk on all 3


7. At what level can RESET be handled? Is it possible to move such handling
all (or mostly: some code would be common with packack handling) down to
the top level port code?
   In general can more specific rules of thumb be given about which errors
should be handled at low level vs. handling in class code? The architecture
document proposes a rule that states you handle errors in class code unless
knowledge of the error condition is complete at lower levels. Can we be
more specific about various types of errors?
   The current use of mount verify to respond to SCSI bus reset makes its
use for path failover difficult and imposes performance penalties which have
no business being present; handling RESET should be done somewhere within
the SCSI subsystem envelope. Since it is a bus-wide condition it would seem
logical to do so in the bus level code (i.e., in the port driver). The question
is whether this is feasible, and what issues arise from doing so.
   My guess: 2-3 weeks
Buzzy. 3 wks.

8. What SCSI knobs and switches should be made controllable for starters?
(There's a good deal in the documents about things to control, but also
how does one set profiles?)
   My guess: week+
Rick. Start in 1wk on all 3

9. What is needed for a driver disconnect capability? (How should one idle a
device? What about long I/O operations? ) (Involves disconnect, reconnect,
and possibly driver unload as a further option.)
   My guess: week+
Grace. 1.5 wks

10. How should VMS locate port drivers and initialize SCSI subsystems?
   My guess: 1-2 wks
Grace. 1.5 wks

11. Is there a better way to pass I/O to port level than the current
svapte/bcnt/boff one? What general memory management routines are needed
to translate addresses?
   My guess: 2 weeks
Buzzy. 4 days
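
As a point of comparison for this topic, a scatter/gather list is one alternative form in which a transfer could be handed to port level. The sketch below is an assumption-laden illustration, not the interface under design: `sg_entry` and `build_sg_list` are hypothetical names, the page frame numbers are assumed to have already been extracted into a plain array (the real code would walk page table entries), and `boff` is assumed to be smaller than the page size.

```c
#include <stdint.h>
#include <stddef.h>

/* One physically contiguous piece of a transfer. */
struct sg_entry {
    uint64_t phys_addr;  /* physical byte address of the piece */
    uint32_t length;     /* length of the piece in bytes */
};

/* Build a scatter/gather list for a buffer described by an array of
 * page frame numbers (pfn[]), a byte offset into the first page (boff,
 * assumed < page_size), and a byte count (bcnt). Physically adjacent
 * pages are merged into one entry.
 * Returns the number of entries written, or -1 if sg[] is too small. */
int build_sg_list(const uint32_t *pfn, uint32_t page_size,
                  uint32_t boff, uint32_t bcnt,
                  struct sg_entry *sg, int sg_max)
{
    int n = 0;
    size_t page = 0;

    while (bcnt > 0) {
        uint32_t chunk = page_size - boff;  /* bytes left in this page */
        uint64_t addr = (uint64_t)pfn[page] * page_size + boff;

        if (chunk > bcnt)
            chunk = bcnt;

        if (n > 0 && sg[n - 1].phys_addr + sg[n - 1].length == addr) {
            sg[n - 1].length += chunk;      /* extends the previous piece */
        } else {
            if (n == sg_max)
                return -1;
            sg[n].phys_addr = addr;
            sg[n].length = chunk;
            n++;
        }
        bcnt -= chunk;
        boff = 0;                           /* later pages start at offset 0 */
        page++;
    }
    return n;
}
```

For example, with 512-byte pages, frames 10, 11 and 20, an offset of 100 and a count of 1000, the first two pages merge into one 924-byte entry and the third page contributes a separate 76-byte entry.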

12. How should AEN be handled? (Target mode too.) Best to put any of
it into port code (beyond the interrupt recognition)? Should a new class
driver be present?
   My guess: 2 weeks ++
Tom. 2 wks (actually 1wk but busy with Japanese next week after)

13. Time-out. Can anything more general be done than having the timeout
set by class level function dispatch? Adapt to devices perhaps? Again how
might one store and init a profile in clusters?
   My guess: 2 weeks+
Jim. 2wks +

14. When should errors be logged and in what form? Can some generic rules
of thumb (testable!) be given more than are in the current document for
this? [rnote: ring buffers etc.]  
Mary. 3.5 wks.
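
On the ring buffer note in topic 14: one reading is a small fixed-size in-memory ring of recent error records that costs little to write and can be dumped when something goes wrong. The following is a rough sketch of that reading only; the record fields, the ring size, and the names `err_rec`, `err_ring`, `err_ring_put` and `err_ring_snapshot` are all invented for illustration, and sequence number wraparound and locking are ignored.

```c
#include <stdint.h>

#define ERR_RING_SIZE 8   /* arbitrary size for illustration */

/* Hypothetical compact error record. */
struct err_rec {
    uint32_t seq;        /* running sequence number of this record */
    uint8_t  target;     /* SCSI ID */
    uint8_t  lun;
    uint8_t  status;     /* SCSI status byte */
    uint8_t  sense_key;  /* from sense data, 0 if none */
};

struct err_ring {
    struct err_rec rec[ERR_RING_SIZE];
    uint32_t next_seq;   /* total records logged so far */
};

/* Log one record, overwriting the oldest once the ring is full. */
void err_ring_put(struct err_ring *r, uint8_t target, uint8_t lun,
                  uint8_t status, uint8_t sense_key)
{
    struct err_rec *e = &r->rec[r->next_seq % ERR_RING_SIZE];

    e->seq = r->next_seq++;
    e->target = target;
    e->lun = lun;
    e->status = status;
    e->sense_key = sense_key;
}

/* Copy the retained records, oldest first, into out[] (which must hold
 * ERR_RING_SIZE entries). Returns the number of records copied. */
unsigned err_ring_snapshot(const struct err_ring *r, struct err_rec *out)
{
    uint32_t count = r->next_seq < ERR_RING_SIZE ? r->next_seq : ERR_RING_SIZE;
    uint32_t first = r->next_seq - count;

    for (uint32_t i = 0; i < count; i++)
        out[i] = r->rec[(first + i) % ERR_RING_SIZE];
    return count;
}
```

Gaps in the `seq` values of a snapshot show how many records were overwritten, which is one way to make a "when to log" rule testable.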


Statuses, 9/19/95

Sue S. - Sent me 1st draft of SDTR info
Jim D. - 240 lines written so far (5-6 pages!) on flow control
Rick - Needs adapter documents. Leaning toward using diagnostic
	page SCSI commands to do bus metrics.
Buzzy - Reviewing use of mem mgt. and is finding map buffers
	straightforward. Designing interface for code to build
	scatter/gather lists.
Grace - 1 page written so far re research on locating port driver. (I
	had discussions with her after mtg & referred her to Sue to
	discuss some common issues betw. autoconfig. and SCSI connection
	setup.)
Tom - Starting to write up AEN & target mode. Suggests a follow-on
	study of whether anything in SCSI 3 will invalidate the
	target mode implementation we have. (Group discussion was that
	target mode is what we have, not really AEN.)
Mary - Looked over port drivers. Looking at class driver error reporting
	now. Finding lots of inconsistency. (In discussions with her
	last evening I told her she's finding exactly the kind of
	inconsistency we need to remove, which I gather & hope helped
	her get clear on what we need.)


----------------------
It is mentioned that a means to allow a port driver to delay a very
short time & be recalled is needed. No queue mgr in scsi2common means
this will be needed. Mention to Jim.

Statuses 9/21/95

Sue S - still writing. SDTR doc nearly done. Looking at sources re
	data structures
Jim D - will send me something. However, interruption rate and scsi
	retrospective interfere; may need to offload some info.
Rick L - going OK. Skeletons entered in note file
Buzzy R - Going well. Writeup on mem mgt in note file. Thinking about
	reset.
Grace W - still investigating. Has enough basic data.
Tom G - Hope to have some writing done by Friday. Nikon CLD a major
	distraction.
Mary Y - Looked over class drivers now. Thinking about the issues.

Marge S - putting more comments into port driver book; new draft in
	a few days.

---------------------------------
Statuses 9/26/1995

Grace - done one study; doing the second.
Marge Sherwood - making some progress on the port driver book, though
	that's #3 on her priority list now.
Jim Dunham - Updated flow control text some in response to my
	handwritten notes asking for more normative (as opposed
	to descriptive of existing code) text. Jim is more willing
	to discuss such needs verbally than is put down on paper
	here. However Jim announced he's taking a job in the cluster
	I/O group. I asked Rick Lord to check over Jim's text and
	need to get back for discussions.
Sue S. - SDTR writeup done; still studying the issues in condensing
	more port driver inputs into a single call & data structure.
Rick L - Done his 3 writeups. Has worked on a SCSI mode program
	which needs to get checked into Ghost somehow; may need a
	review. I will look it over with him.
Buzzy R. - reviewing drivers for reset handling, but has been
	involved with qlogic rathole (and has the code in hand).
Tom G. - Japanese board arrived along with Tom Y., from DEC Japan,
	but the board did not work. Attempting to get another one
	rush-shipped in from Japan.

I told the group that some CLDs will be coming soon (some possibly
as early as tomorrow) and that when their current studies are done
we need to cross review.

In addition Dave Fairbanks needs a code review of PKSdriver code
after about 10/10; this code supports scsi clusters, for Ghost.

I have a slight extension of Sue's pkcdriver fix that permits
disabling SDTR to any device on SCSI busses A-D running on my
workstation. Sue has a copy of the code, in case it might be
useful to let selected sites get around SDTR problem devices.