Sunday, February 11, 2007

Mistake proofing...

Mistake proofing means taking precautions so that an error cannot occur. In England, it was found that people confused the petrol and diesel fillers while fueling their cars at gas stations (drivers there fill their own tanks). Manned gas stations did not solve the problem completely because there was always the possibility of human error. Ultimately, the car manufacturers fitted petrol cars with a positive-polarity fuel tank mouth and diesel cars with a negative-polarity one. The gas stations had exactly the opposite polarity: petrol dispensers were fitted with negative polarity and diesel dispensers with positive polarity. (Like charges repel, unlike charges attract!)

DPMO View

Why view defects "per million"? Would a percentage view not suffice?

Defects per million gives a better insight into defect severity. It magnifies the defect count, so differences stay visible even when the process is efficient and the defect rate is small. For example, you can say that your process has only 1% defects. Expressed as a percentage, the severity of the defects does not seem like much. Defects look manageable!

Parts per million (PPM) magnifies the defects and shows a more realistic picture. When translated to PPM, 1% is equal to 10,000 defects per million! Now this seems too big a value to neglect.
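
The conversion is simple arithmetic: one percent is one part in a hundred, so multiplying by 10,000 gives parts per million. A minimal sketch (the function name percent_to_ppm is made up for illustration):

```python
def percent_to_ppm(percent: float) -> float:
    """Convert a defect rate in percent to parts per million (PPM)."""
    # 1% = 1/100, and 1/100 of a million is 10,000.
    return percent * 10_000


print(percent_to_ppm(1))  # the 1% example from the text
```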

Instead of PPM, a better way to denote variation is to count the individual defects against the number of chances for a defect to occur. Thus, we have DPMO (Defects Per Million Opportunities) instead of PPM.
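
DPMO is usually computed as the number of defects divided by the total number of opportunities (units inspected times opportunities per unit), scaled to a million. The sketch below assumes that standard definition; the sample figures (100 units, 3 opportunities each, 3 defects) are hypothetical and only there to exercise the formula.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity")
    return defects * 1_000_000 / total_opportunities


# Hypothetical lot: 100 units, 3 ways each unit can be defective, 3 defects found.
print(dpmo(defects=3, units=100, opportunities_per_unit=3))
```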

Same mean, diff std dev curves...

This image shows curves having the same mean but different standard deviations. The mean is the line around which most of the values are crowded. The curve with the highest peak has the smallest standard deviation and represents the most efficient process.

As the curve gets taller and narrower, the area that falls outside the specification limits shrinks. This means the number of defects goes down as the peak gets taller. In the curve with the highest peak, fewer issues need to be addressed than in the curve with the lowest peak.
The higher the value preceding the sigma sign, the lower the possibility of defects occurring. Thus, an 8 Sigma process has a lower possibility of defects than a 6 Sigma process, which in turn has a lower possibility of defects than a 4 Sigma process.
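
To put rough numbers on this, the sketch below computes the area of a normal curve that falls outside symmetric specification limits. It assumes a perfectly centered normal distribution with limits at mean plus or minus a fixed half-width; the half-width of 6.0 and the three standard deviations are hypothetical values chosen to give 4, 6, and 8 sigma processes. It does not apply the 1.5 sigma shift that is conventionally used when quoting Six Sigma defect rates, so the figures will differ from published Six Sigma tables.

```python
import math


def fraction_outside(spec_half_width: float, sigma: float) -> float:
    """Fraction of a centered normal distribution that falls outside
    mean +/- spec_half_width, for a given standard deviation."""
    k = spec_half_width / sigma         # the "sigma level" of the process
    return math.erfc(k / math.sqrt(2))  # two-sided tail area


SPEC_HALF_WIDTH = 6.0  # hypothetical distance from mean to each spec limit

for sigma in (1.5, 1.0, 0.75):  # same mean, shrinking standard deviation
    level = SPEC_HALF_WIDTH / sigma
    ppm = fraction_outside(SPEC_HALF_WIDTH, sigma) * 1_000_000
    print(f"{level:.0f} sigma process: {ppm:.6g} defects per million")
```

The smaller the standard deviation, the taller the peak and the smaller the printed defect rate.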

Friday, February 09, 2007

Data Types

There are two data types: 1. Attribute 2. Continuous. Attribute data has countable quality characteristics, for example, number of defects, number of defectives, number of NCs, etc. Continuous data, on the other hand, has measurable quality characteristics, for example, length of a spark plug, weight of a spark plug, temperature at which the spark plug has maximum efficiency, etc.

If a software project just collects data on whether each milestone is met or not met, it is collecting attribute data. This does not tell us by how much we have exceeded or fallen short of the expectations.

Another example shows the difference between attribute and continuous data. In a drinking-glass manufacturing plant, two teams check that each glass is of the stipulated length. The first team uses vernier calipers to measure the length. If the glass is of the stipulated length, it passes the quality check; otherwise it fails. Because the length of the glass is MEASURED, this is continuous data.

The second team uses the go/no-go gauge technique. Here, each glass is passed through two separate raised platforms. The first platform admits a glass of the stipulated length, while the second admits only shorter ones. The inspected items are passed through both platforms one after another. If a glass passes through both of them, it is shorter than desired. If it passes through neither of them, it is longer than desired. This way of gauging relies on ATTRIBUTE data, because the team checks a Yes/No condition for each glass.

Attribute data does not need costly instruments. In our example, the second method is far cheaper than using vernier calipers, but we lose a lot of detail.

Note: the difference between defects and defectives. “Defects” is the total number of defects in all the pieces inspected. “Defectives” is the count of items which have defects. For example, in a water glass manufacturing plant, in a lot of 100, these defects were found in one inspected item: the length of the glass is improper, and it has cracks. In another inspected item, this defect was found: the shape was malformed.

So, out of 100, the defectives here are two glasses, while the defects are three (for glass 1, length and cracks, and for glass 2, the malformed shape). “Defects” is therefore always a better representative of the abnormalities / deviations than “Defectives”.
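
The same count can be written out as a short sketch. The dictionary layout and the labels "glass 1" and "glass 2" are just one illustrative way to record the inspection results described above; the other 98 glasses in the lot have no recorded defects.

```python
LOT_SIZE = 100

# Defects found per inspected glass; glasses with no defects are omitted.
defects_by_glass = {
    "glass 1": ["improper length", "cracks"],
    "glass 2": ["malformed shape"],
}

# Defectives: how many items have at least one defect.
defectives = sum(1 for found in defects_by_glass.values() if found)
# Defects: the total number of individual defects across all items.
defects = sum(len(found) for found in defects_by_glass.values())

print(f"Lot of {LOT_SIZE}: {defectives} defectives, {defects} defects")
```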

Note that continuous data can be converted to attribute data, but not vice versa. So, it is always better to go for continuous data if it is possible to measure it.
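
A small sketch of why the conversion only works one way. The measured lengths and the specification limits below are hypothetical; the point is that once each measurement is reduced to pass/fail, the original values cannot be recovered from the result.

```python
def to_attribute(lengths_mm, lower_mm, upper_mm):
    """Reduce measured lengths (continuous data) to pass/fail (attribute data)."""
    return [lower_mm <= length <= upper_mm for length in lengths_mm]


# Hypothetical caliper readings and spec limits, in millimetres.
measured = [119.2, 120.0, 120.4, 121.7]
passed = to_attribute(measured, lower_mm=119.5, upper_mm=120.5)

print(passed)
# The pass/fail list no longer says how far off the failing glasses were,
# or even whether they were too short or too long.
```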

Quartile Deviations – Sample & Analysis

This table represents the marks scored in the math exam by students in the XII-A, XII-B, and XII-C sections. Q4, Q3, Q2, Q1, and Q0 are the quartile boundaries. In lay terms, they divide the range of marks into 4 sections.


Q0-Q1 is the first quartile
Q1-Q2 is the second quartile
Q2-Q3 is the third quartile
Q3-Q4 is the fourth quartile


For XII-A, there is not much variation between Q3 and Q4. This means there is little variation among the top performers of the class. Q3 and Q2 show some variation.
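
The quartile boundaries can be computed with Python's standard statistics module, as in the sketch below. The marks are made-up sample data, not the values from the table, and the "inclusive" interpolation method is one of several conventions for locating quartiles, so other tools may give slightly different cut points.

```python
import statistics


def quartile_boundaries(marks):
    """Return (Q0, Q1, Q2, Q3, Q4) for a list of marks."""
    q1, q2, q3 = statistics.quantiles(marks, n=4, method="inclusive")
    return (min(marks), q1, q2, q3, max(marks))


# Hypothetical marks for one section.
marks = [35, 48, 52, 60, 67, 71, 78, 85, 90, 92, 95]
q0, q1, q2, q3, q4 = quartile_boundaries(marks)

print(f"Q0={q0} Q1={q1} Q2={q2} Q3={q3} Q4={q4}")
print(f"Spread among top performers (Q4 - Q3): {q4 - q3}")
```

A small Q4 - Q3 gap relative to the other gaps is what "little variation among the top performers" looks like in numbers.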
