Weibull..or not!
[QUOTE] One of the advantages of Weibull analysis is the ability to analyze small amounts of failure data and get some reasonably accurate failure forecasts without having to wait for more failures [QUOTE]
Gary,
How confident are you about your conclusions? If you did any hypothesis testing, what would that tell you? I must say the graphs look very convincing, but what are the chances in each case (A and B) of the next component lasting 1000 hours or 5000 hours, and what would that do to your analysis?
Rgs Steve
||||
|
Hello Steve,
With the Weibull datasets used I am confident of my decisions. If I include some failures that had occurred at 1000 hours or even 5000 hours in both components A and B, I would be looking at infant mortality in both components. I would then need to perform RCA on both failures. The fact that I had some failures at 1000 and 5000 hours may very well indicate that my component has more than one failure mechanism and that I had performed my analysis at too high a level. I have not performed hypothesis testing but I am aware of its functionality. I am not sure what a hypothesis test would tell us with the same datasets.
Cheers Gary
||||
|
Hi Gary,
I appreciate your participation, but unfortunately what you are saying proves my concerns precisely: Weibull is highly overrated, practically useless and, in this particular case, provides information which is almost certainly wrong. Yet because you are using a process people respect, you think the answer is right. As with any tool, the people using it should be properly trained. If you don't know about confidence levels and hypothesis testing then you should not be making predictions using statistical methods.
The MTBF of both components is most unlikely to be 100 days as you have stated with confidence. The life and failure pattern of these components could be anything, yet you seem very confident that if you put your three data points into the Weibull plot then you can proceed and tell someone the right answer.
One of the biggest problems in our industry is the Weibull plague. Many companies are spending large sums on analyses which are wrong. I hope that, as an industry, we come to our senses quickly and get back to dealing properly with what we know, and not making up models that have little or no substance.
Bottom line on this analysis: you have three data points on each of two components that tell you that there is something wrong. They don't tell you much about what is wrong and certainly don't tell you what to do about it. Hope this helps the community to beware of Weibull analysis.
Regards Steve
||||
|
Hello Steve,
You should run for Government - Why? Because you can put words in other people's mouths!! I am not here to argue the case for Weibull analysis. I've stated my case, and it's obvious you have a lack of understanding of the type of analysis being performed. What I would like to do is bring your attention to the following post from you on the 1st May 2008 under the post "Reliability past RCM": "What makes this forum live is/are different opinions. Thanks for yours Gary. They are different to mine"
Well Steve, thanks for your opinions, they are different to mine!!
Cheers Gary
P.S. If you want some course dates from ourselves to learn about RCM and Weibull analysis please go and look at our website www.globalreliability.com.au You never know, you might learn something...
||||
|
Thanks Gary.
The point is we get to discuss these things and explain each other's views.
Rgds Steve
||||
|
Steve, Gary,
I have been following this thread but did not have the time to participate in it so far. Steve, I think you should take some time to understand the Weibull analysis process properly. It is fairly simple to use, fast, powerful and does lead to useful actions, contrary to your belief. The problem with it, as with every other analysis process, is that we invariably hit the brick wall of poor data quality in the CMMS. That is not a Weibull issue, it is more fundamental. Even my local General Hospital in Aberdeen uses the method, which, while useful to maintainers and reliability engineers, has much more widespread application. If of course you choose to discredit it, so be it. As you rightly say, we all have our views. Mine is different from yours, and we can leave it there. Gary, thanks for some excellent explanations.
Regards,
V.Narayan (Vee)
Lead Author, 100 Years of Maintenance: Practical Lessons from Three Lifetimes, Industrial Press NY, ISBN-13: 978-0831133238
Author, Effective Maintenance Management: Risk and Reliability Strategies for Optimizing Performance, 2004, Industrial Press NY, ISBN-13: 978-0831131784
||||
|
Hello Vee,
Thank you for your input.
Cheers Gary
||||
|
Vee,
Thanks for your input. Please continue. You have vast experience in this subject and others. I am keen to learn and would welcome some solid evidence of what Weibull can do in cases where data is scarce and unreliable. I know that where there is an abundance of good data Weibull can assist greatly in synthesis; however, my major point is that in maintenance that is rarely the case. And in many cases I have seen, the data is collected at such a high level that it is unlikely to fit the Weibull curve, because if there are three dominant failure mechanisms, there are likely to be one or two humps in the failure pattern. The Weibull distribution only handles two dominant failure mechanisms.
Gary stated that using Weibull, one can get meaningful results with very little data. In my professional opinion this is completely wrong. The statistical significance of three data points in creating a Weibull curve is very low indeed. Anyone turning this analysis into fact is very courageous indeed.
As you and others would know, we run a maintenance analysis company. We are occasionally called in to sites where they have been trying to apply statistical methods and have got seriously lost. They end up creating data or taking it from a third party source that may have no relevance to their plant. One recent company sent the analysis back four times and were thinking the project had been a great waste of time. This annoys me greatly.
If you read Gary's example on Component A and Component B, he states one of the components is random and the other is wear out. The other quite likely outcome is that they are both infant mortality failures, as I pointed out when I asked what would happen if the next component lasted ten times the MTBF of the three component failure data points. If the next failure lasts ten times the mean of the others, the result is that the initial analysis is completely destroyed.
In the aviation industry, reliability growth programs would never contemplate making predictions on such small data sets. In my younger days, I was involved in such programs for military aircraft. Maybe you can present your findings on the data Gary presented. Would you proclaim confidently that one failure pattern is random and the other is wear out, and would you state that the component's MTBF is 100 hours based solely on the three data points?
Sadly, this analysis is indicative of what occurs in reality in industrial plants. Recently someone, defending the modelling he did on a new plant design, told me that the result of his capacity model (x tonnes per year) was over 98% confidence. What he meant was that he ran his model sufficient times to establish that 98% of the time the model showed the capacity was above x. He never thought to provide a confidence assessment of the validity of the data in the model, the model itself and the numerous assumptions the model contained. Now there is a chance that the model is fine if the data and assumptions are conservative, but in most cases the data is "cleaned" before use, hence the data that does not fit the model is ignored. To me this is a serious problem. Companies spend millions of dollars in capital on such justification. I would be grateful if you could correct me, but this is where the statistical methods get way out of hand.
I know this is a long read, but rather than drip feeding the forum, I thought I would set out some of my major concerns.
Regards Steve
This message has been edited. Last edited by: Steve Turner
||||
|
Steve,
You ask "what would happen if the next component lasted ten times the MTBF of the three component failure data points. If the next failure lasts ten times the mean of the others, the result is the initial analysis is completely destroyed."
The example Gary took up was to illustrate a point. Let us remember that we are not merely examining the numbers or stats; there are REAL degradation mechanisms behind these. The numbers come from real failures, caused by a combination of circumstances. That means you get a pattern that is not arbitrary but specific to the operating parameters, design, repair quality etc. So we cannot pick numbers at random, like 10 x MTBF, except to try to win an argument.
Your solution to not getting good data seems to be to dump the method, if I understand you correctly. I suggest we get the data quality up instead, but that does not affect just this form of analysis. The alternative is for maintainers to fly by the seat of their pants, using guesswork instead of logic.
You said something about modeling that really intrigues me. Your example seems to be one of bad modeling practice. I am amazed that you did not get the confidence band of the results. I think it is sloppy work; there is nothing wrong with the modeling process. Again, if your belief is that modeling and Weibull are useless, please feel free to hold those views. My own experience is quite different.
Regards,
V.Narayan (Vee)
Lead Author, 100 Years of Maintenance: Practical Lessons from Three Lifetimes, Industrial Press NY, ISBN-13: 978-0831133238
Author, Effective Maintenance Management: Risk and Reliability Strategies for Optimizing Performance, 2004, Industrial Press NY, ISBN-13: 978-0831131784
||||
|
Vee,
You are right about the latter point re bad modelling practices. My big problem is that there are far too many people out there who get sold a tool without knowing how to use it, what it does and what it doesn't, and they become almost fanatical about the results they can create and the willingness of management to believe them. The other sad thing is that there are people out there selling these tools who don't train the users properly.
The example I used was to try and illustrate the limitations of making decisions on populations of data with only three data points. Perhaps I could have used a roulette wheel / table as an example. The ball lands on the numbers 2, 4 and 6. Does that mean the average of all the numbers is 4? Does that tell you the wheel has some bias to low numbers?
Re your comment
Yes - there is no point using the method if you don't have good data... now if you do have good data and lots of it, then fine. I wholly agree. My point is not that the method is wrong - it is abused. Gary's example is one of the abuses. The modelling example is one of the abuses. There are plenty and I hope to use the forum to highlight these. I am not trying to discredit statistical methods - I am trying to highlight the practical difficulties of these methods in the industrial environment, in the hope that fewer people end up wasting their time or, in the worst case, spend the time and make bad decisions that are costly.
Rgds Steve
This message has been edited. Last edited by: Steve Turner
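To put a number on the roulette question above (spins of 2, 4 and 6), here is a minimal sketch of a conventional 95% confidence interval for the mean from three observations. It assumes a Student t interval, which strictly presumes roughly normal data and is used here only to show how wide the interval becomes with n = 3. The critical value 4.303 is the standard two-sided 95% t value for 2 degrees of freedom; the variable names are illustrative.

```python
import math
import statistics

# The three roulette results quoted in the post above.
spins = [2, 4, 6]

n = len(spins)
mean = statistics.mean(spins)
sd = statistics.stdev(spins)          # sample standard deviation (n - 1 divisor)

# Two-sided 95% Student t critical value for n - 1 = 2 degrees of freedom.
T_CRIT_95_DF2 = 4.303

half_width = T_CRIT_95_DF2 * sd / math.sqrt(n)
low, high = mean - half_width, mean + half_width

print(f"sample mean = {mean}, 95% interval = ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval spans roughly ten units around a sample mean of 4, so three spins say very little about the true average of the wheel.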
||||
|
Let me expand on the roulette wheel example:
The wheel has a distribution set by the numbers on the wheel. There are more of some numbers than others. These numbers represent component life. You don't know what the largest number is but you know that the numbers start from zero. You run the wheel three times and you get three numbers. Can you tell me: (1) the average of all the numbers represented; (2) the distribution of the numbers, drawn on a graph; and (3) whether the distribution is normal or Weibull or Chi-squared... or any other distribution?
Rgds Steve
||||
|
Steve,
Vee was correct when he said I was using the 3 data points to make a point. In industry we do have data deficiencies and there have been occasions when I had hoped for better data during my analysis. However, there are times when urgency is required and a sense of direction is needed, particularly in the case of safety, environmental and high cost issues. We cannot afford to wait for more failures to occur in order to support our conclusions. Steve, if you find Weibull analysis and modeling useless then simply don't use it. Like Vee's experience, mine is quite different.
Cheers Gary
||||
|
Steve,
Roulette wheel "component failures" as requested:
Using a single zero roulette wheel with numbers from 0-36 (I know how many numbers there are and what the highest number is) the numbers drawn were: 21, 33 and 29.
The average of all the numbers represented on the single zero roulette wheel = 18
The average of the numbers drawn = 27.66
If these numbers represented a 'component failure' we would be looking at wear out.
Eta = 30.41
Beta = 4.096
If the consequence of these 'components' failing is high and the components are critical, I would look at some fixed time preventive maintenance, or if the consequence was no impact on cost, safety and environment, I would possibly run to failure. If the system was super critical I may even need to build in some redundancy.
I've attached a probability density function plot, a failure rate plot and a cumulative probability plot. I now rest my case.
Cheers Gary
Maybe we could now start a virtual roulette game within the forum.
Attachment: Roulette_Wheel_Example.pdf
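The post does not say how Eta and Beta were obtained. A minimal sketch, assuming median rank regression (regressing on Y) with Benard's approximation for the median ranks, gives values very close to the quoted Eta = 30.41 and Beta = 4.096. The function name `weibull_mrr` is illustrative, and the choice of method is an assumption rather than something stated in the thread.

```python
import math

def weibull_mrr(times):
    """Fit a two-parameter Weibull by median rank regression (Y on X).

    Assumes complete (uncensored) failure data and Benard's approximation
    for the median rank of each ordered failure.
    """
    t = sorted(times)
    n = len(t)
    # Benard's approximation: F_i ~ (i - 0.3) / (n + 0.4)
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    # Linearised Weibull CDF: ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    beta = sxy / sxx                       # slope = shape parameter
    eta = math.exp(x_bar - y_bar / beta)   # from the intercept = -beta * ln(eta)
    return beta, eta

beta, eta = weibull_mrr([21, 33, 29])
print(f"beta = {beta:.3f}, eta = {eta:.2f}")
```

With only three points the regression line is fitted through three plotted positions, so the fit will always look tidy on Weibull paper regardless of how the data arose.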
||||
|
So when safety is at stake it is ok to take a wild guess?
|
||||
|
I have noticed that it has been pointed out that Weibull needs data. OK, so it does, and where does this data normally reside? Many companies set up their CMMS and may capture good or bad quality data. Those that capture good quality data are using it; those that capture bad quality data don't know what to do with it.
So back to Weibull: how does one configure a CMMS so that even on a new asset such data can become useful to decision making quickly? One great way is that when you design your equipment hierarchy you create equipment classes and assign these categories against your equipment. Let's say you have 100 single stage centrifugal type pumps. In a short period of time, as we do some repairs or planned maintenance, we are collecting quite some data for this class, which gives some great Weibull data as an initial starting point. I guess my point is that the CMMS is a great source of data that can be used for decision making, if you know how it needs to be configured to drive reliability engineering.
This message has been edited. Last edited by: Paulius
||||
|
I am a little confused by this comment. Surely if we have some instances recorded in our CMMS then this is no longer a wild guess? And before anyone says "well, your maintenance strategy has already failed": inspection results are great pieces of data that again can be used in statistical analysis to help us prevent such safety failures. So again, what reliability engineering does for us is that we then don't need to take such "wild guesses", which is exactly what we are trying to avoid. The situation is understood and the maintenance strategy adjusted accordingly.
The nuclear industry cannot afford to have certain failures, so statistical techniques and reliability engineering are very common practices there, with engineers well trained in their use. So does every industry need to have such highly qualified reliability engineers or experts to perform such statistical analysis? My guess is not, but such techniques, whether they are used to their fullest extent or just provide continuous tweaks to the asset strategy, can benefit organisations greatly over the asset life.
This message has been edited. Last edited by: Paulius
||||
|
Ok Gary ... here is one of those situations where there are safety concerns. If this component fails, you run the risk of hurting someone badly. I want to reduce the risk of this component failing down to 1/10,000 components by removing it and replacing it with a new item. Can you calculate for me when I should change these out please?
Rgds Steve
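One reading of this question is: at what age does the cumulative probability of failure reach 1 in 10,000? A minimal sketch, assuming the two-parameter Weibull with Gary's fitted Eta = 30.41 and Beta = 4.096 is the true life distribution (which is exactly the point in dispute), inverts the Weibull CDF F(t) = 1 - exp(-(t/eta)^beta). The function name is illustrative, and other readings of "1/10,000" (for example a failure rate per unit time) would give different answers.

```python
import math

def weibull_age_for_risk(eta, beta, risk):
    """Age t at which the Weibull cumulative failure probability equals `risk`.

    Solves 1 - exp(-(t / eta) ** beta) = risk for t.
    """
    return eta * (-math.log(1.0 - risk)) ** (1.0 / beta)

# Parameters quoted earlier in the thread; units are those of the data.
eta, beta = 30.41, 4.096
t_replace = weibull_age_for_risk(eta, beta, 1.0 / 10_000)
print(f"replace at about {t_replace:.2f} time units")
```

The answer depends heavily on Beta because the calculation extrapolates far into the lower tail, well below the smallest of the three observed values, so any error in the fitted shape parameter is magnified.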
||||
|
Hello Steve,
Firstly, I personally would not use this single component in a safety system and use fixed time replacement as my strategy. However, for the purpose of your question, I decided to look at alternative strategies.
Over a 1 year life this component is going to fail 314 times and give me 78.4 hours downtime, resulting in an availability of 99.11% (I assumed a 0.25hr repair time). I now have the potential to injure someone 314 times - not acceptable! I would consider redesign and also look for a component with a longer characteristic life and possibly a better availability.
The single component also exceeded my site safety criticality target of 1. If I am to meet my target of less than 1 and not have any injuries as a result of this component failing, I would need to replace my single component every 2.5 hours. This would result in 3184 replacements a year of this component! This strategy would not be acceptable.
If you had no alternative other than to use this component, and depending on what SIL (safety integrity level) you required from the safety system, you may need to consider increasing the amount of component redundancy. By increasing the level of redundancy to three more components in parallel, I have increased the availability level to 99.99%. If the failures of the component are hidden then my strategy would look at some online testing, complete with a warning alarm system to indicate component failure. I may then need to perform some functional alarm testing procedure.
I have attached a FMECA (failure modes effects criticality analysis) and RBD (reliability block diagram) with results for my 'roulette component'.
How would you deal with this "roulette component" in PMO? - Go on, give the wheel a spin!
Cheers - Gary
Attachment: Roulette_Component_Strategy.pdf
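The yearly figures above can be roughly checked with simple renewal arithmetic. This is a sketch under stated assumptions, not a reconstruction of the attached RBD: it takes the MTBF as 27.66 hours (the sample mean from the roulette post), the repair time as the stated 0.25 hours, and a year as 8760 hours.

```python
HOURS_PER_YEAR = 8760.0

mtbf_h = 27.66      # assumed mean time between failures (sample mean of 21, 33, 29)
repair_h = 0.25     # repair time stated in the post

cycle_h = mtbf_h + repair_h                  # one run-to-failure plus repair cycle
failures_per_year = HOURS_PER_YEAR / cycle_h
downtime_h = failures_per_year * repair_h
availability = mtbf_h / cycle_h              # steady-state availability

print(f"failures per year ~ {failures_per_year:.0f}")
print(f"downtime ~ {downtime_h:.1f} h, availability ~ {availability:.2%}")
```

The results land close to the quoted 314 failures, 78.4 hours and 99.11%, with small differences that probably come from rounding or from the simulation behind the attachment.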
||||
|
Hi Gary,
In the absence of any direction from me, your analysis is based on hours - I intended the units of measure to be years. I did not specify this, so my apologies here. A component with the assumed rate of failure (27.66 hours MTBF) would never be considered for such an application. Redesign is the way to go, regardless of what method is applied. Can you provide the change out frequency to reduce the in-service failure rate of this component to 1:10,000 years if the parameters are in units of years? I will then argue my case.
Re your suggestion that I provide the PMO answer, I have to say we are not comparing PMO and Weibull. We are debating the validity of Weibull analysis. PMO can use Weibull as a method of frequency determination just as much as RCM can. I am questioning the ability of Weibull to generate realistic outcomes on three data points. I think you are proving to me it can't be done successfully. If you prove me wrong I will learn something very important about statistical methods.
Regards Steve
||||
|
Gary,
You may have rested your case but I believe you have just proven Steve's case. Based on three points of data, which we know have been generated by a random process, you have recommended the implementation of a fixed time PM task. If we continue to generate more data points using the roulette wheel approach we know that the Weibull Analysis will converge on Beta = 1 for a random process. However, in the meantime your plant has been wasting money performing a useless PM task because the Weibull analysis on three points of data indicated that you should. If your plant is like many others it will be a lot harder to remove the PM task than it was to implement it in the first place, even when the subsequent data indicates that it was a bad decision.
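How much the fitted Beta can move from one set of three spins to the next can be explored by simulation. The sketch below draws three distinct numbers from 1 to 36 many times (zero is excluded because the log of zero is undefined) and fits each triple by median rank regression with Benard's approximation. The fitting method, the seed, the trial count and the function name are all assumptions made for illustration. It reports only the spread of the three-point estimates and makes no claim about what Beta a very large sample would settle on.

```python
import math
import random

def weibull_beta_mrr(times):
    """Shape parameter from median rank regression (Y on X), complete data."""
    t = sorted(times)
    n = len(t)
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]  # Benard's approximation
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sxx

rng = random.Random(0)          # fixed seed so the run is repeatable
TRIALS = 2000

betas = []
for _ in range(TRIALS):
    spins = rng.sample(range(1, 37), 3)   # three distinct "lives" from the wheel
    betas.append(weibull_beta_mrr(spins))

betas.sort()
p5, p50, p95 = (betas[int(q * TRIALS)] for q in (0.05, 0.50, 0.95))
print(f"beta from 3 spins: 5th pct = {p5:.2f}, median = {p50:.2f}, 95th pct = {p95:.2f}")
```

Comparing the 5th and 95th percentiles shows the range of shape parameters, and therefore of apparent failure patterns, that the same wheel can produce from three spins.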
||||
|