Weibull..or not!
|
| <Rui Assis>
|
No offense Josh,... Regards, Rui |
||
|
No problem. I just want to hear the views against Weibull analysis. The book is self-explanatory when it comes to Weibull analysis!
|
||||
|
Our company has dozens of engineers who review field failure data daily and use the Weibull distribution to model failure patterns. This gives design engineers a benchmark from which to work when designing out specific failure modes. Repeated analysis (say 6 or 12 months after a design change is implemented) gives confidence a given design change was effective.
Cheers, Matt. |
||||
|
Matt,
Good news. Can I ask what sort of industry you have and how much data you get? Do you get data from lots of sites and pool it? I used to work in aviation and when we had problems we had lots of data (unfortunately - cos most people did not like having failures). Even though the data was plentiful, it did not always fit the Weibull curve and always required us to delve deeper into the mechanisms that created the patterns. Regards Steve www.pmoptimisation.com.au |
||||
|
Gary,
Can you let us have your solution to the problem so we can assess all the suggestions?
I have lots of comments but would like to see all the answers first. Regards Steve www.pmoptimisation.com.au |
||||
|
Steve, as you may remember (since we met at ICOMS 2006 and you have my card) I work for a mining equipment OEM.
We get our data through our OEM dealers, and plenty of it, so we may be more fortunate than others. Yes, we have one global database of failure reports, so parts failing on machines in various geographical regions can be pooled to increase confidence in the estimation of the shape and scale factors. Like any tool, this sort of analysis needs to be applied carefully and with appropriate engineering judgement, but it seems to work just fine for us. Cheers, Matt. |
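For readers who have not seen how shape and scale are estimated from pooled failure times, here is a minimal sketch using median rank regression. It is not Matt's company's method. The data are synthetic (three hypothetical sites drawn from one Weibull), every failure is assumed observed (no suspensions), and the function name `median_rank_fit` is made up for illustration.

```python
import math
import random


def median_rank_fit(times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression."""
    ordered = sorted(times)
    n = len(ordered)
    xs, ys = [], []
    for i, t in enumerate(ordered, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's approximation to the median rank
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))
    # Ordinary least squares of y on x, where y = beta * x - beta * ln(eta)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = my - beta * mx
    eta = math.exp(-intercept / beta)
    return beta, eta


# Synthetic failure times only (not field data): three hypothetical sites
# drawing from the same Weibull, pooled into one sample before fitting.
rng = random.Random(42)
TRUE_BETA, TRUE_ETA = 2.0, 1000.0
sites = [[rng.weibullvariate(TRUE_ETA, TRUE_BETA) for _ in range(200)]
         for _ in range(3)]
pooled = [t for site in sites for t in site]

beta_hat, eta_hat = median_rank_fit(pooled)
print(f"pooled n={len(pooled)}  beta={beta_hat:.2f}  eta={eta_hat:.0f}")
```

Pooling only helps if the sites really share one failure mechanism; if they do not, the pooled fit can look tidy while describing none of them.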
||||
|
Matt,
Sorry for not making the connection. I am sure that where you have lots of data and use appropriate engineering judgement, you will get good results. But I think if you don't ask "why" the data is what it is, incorrect conclusions can easily be made. I try to fathom the wisdom of analyses done from three data points. This is my stance. Regards Steve |
||||
|
Rui,
I do have some comments.
1. First of all, you said you did not use a lot of the data because it was incongruent. I am sure you are not suggesting that you left this data out because it did not fit your hypothesis. But I am curious as to why this data is not part of the analysis... would it be possible for you to give us this data as well?
2. Secondly, if you view the graph I created with your data, it looks to the layman like a strong random pattern with some sort of peak of failure between 40 and 50 days. Now for any decent conclusion to be reached, I would be trying to figure out why that is. Having three times the failures in the 40 – 50 range compared to the others could be a quirk of a random distribution, but it may well be a different failure mechanism... yes, there could be two or more mechanisms working. If this is the case, pooling all the data into one histogram or Weibull plot will not allow segmentation and will not allow the best solution to be generated. What seems to have happened is that you have recommended an action without much regard to why the failures are occurring, and in the absence of information you have sought to minimise the downtime caused by them. Now this is fine if the failures are sudden and truly random, but it is not the best answer if there are two failure mechanisms, one being random and the other happening between 30 and 40 days. I take your comment that the people who may know the answers are not being helpful, which is a shame. Hence you have little to go on and are in some ways forced to your statistical solution.
3. Thirdly, I think the real answer lies in doing a Root Cause Analysis, which you have recommended. There must be something wrong with either the design or the way the autoclaves are being operated. The failure cause may end up having nothing at all to do with the laws of physics, but be due to one operator doing the wrong thing.
4. Fourthly, my suggestion would be to do the PM at 21 days (assuming the PM is some kind of overhaul). I know your statistics show that 27 days gives the least downtime, but no failure has occurred before 24 days, and the downtime minimum you have found does not consider the implications and costs of the 10% of times the failure occurs unexpectedly, or the secondary costs associated with breakdown maintenance as opposed to planned maintenance. At 21 days the job will almost always be a planned job... at 27 days, around 10% will be breakdowns. I would say that your statistical analysis has found a minimum of downtime, but I don’t think the problem has been solved.
So there you have my comments and suggestions... I would still like to see the remaining data that was excluded. And I would be going to the hospital management and demanding the contractor's assistance in helping to solve the problem.
HospitalData.xls (14 Kb, 19 downloads) Autoclave failure Pattern |
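The trade-off in point 4 can be sketched with a standard age-replacement availability model. This is an illustration, not the spreadsheet discussed in the thread. The Weibull parameters and both downtime figures are hypothetical, chosen only so that roughly 10% of units fail by day 27, and the function names are made up.

```python
import math


def reliability(t, beta, eta):
    """Weibull survival function R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta) ** beta))


def availability(interval, beta, eta, planned_down, breakdown_down, steps=2000):
    """Long-run availability when the item is overhauled at `interval` or on failure."""
    # Expected uptime per cycle is the integral of R(t) from 0 to interval,
    # evaluated here with the trapezoid rule (R(0) = 1).
    h = interval / steps
    uptime = 0.5 * h * (1.0 + reliability(interval, beta, eta))
    uptime += h * sum(reliability(k * h, beta, eta) for k in range(1, steps))
    survive = reliability(interval, beta, eta)
    # Expected downtime per cycle: planned if it survives, breakdown if it fails.
    downtime = planned_down * survive + breakdown_down * (1.0 - survive)
    return uptime / (uptime + downtime)


def best_interval(beta, eta, planned_down, breakdown_down, candidates=range(5, 61)):
    """Grid search for the PM interval (in days) with the highest availability."""
    return max(candidates,
               key=lambda T: availability(T, beta, eta, planned_down, breakdown_down))


BETA, ETA = 3.4, 53.0   # hypothetical; gives roughly 10% failures by day 27
PLANNED = 0.5           # hypothetical days lost to a planned overhaul

for breakdown in (2.0, 10.0):  # hypothetical days lost to an unplanned failure
    T = best_interval(BETA, ETA, PLANNED, breakdown)
    print(f"breakdown penalty {breakdown:4.1f} d -> best interval {T} d, "
          f"availability {availability(T, BETA, ETA, PLANNED, breakdown):.4f}")
```

With these made-up numbers the best interval shortens as the breakdown penalty grows, which is the point being argued: an optimum found on repair time alone can move earlier once secondary breakdown costs are counted.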
||||
|
Gary,
Regarding your question below...
I have many answers.... here is my first. Both of these failure patterns are in fact infant mortality. No amount of maintenance is going to make things better... in fact the more you try, the worse things will get. If your Weibull analysis tells you otherwise and you believe it, you are going to get the sack shortly. Why do I say that... well, I have lots of Component A and lots of Component B on my site. I get at least 3000 hours out of each of them. You are using these parts in the wrong application. Regards Steve www.pmoptimisation.com.au |
||||
|
I have no plans personally to delve into statistics to analyse failures. In our plant, in machinery, we don't have multiple examples of similar equipment. I could see it being used in electrical or instruments where there are many similar bits of equipment.
I haven't seen a failure yet where I could imagine that statistics would have helped pinpoint the root cause. Besides, from my perspective there are many other issues which could be resolved and would increase efficiency in machinery maintenance. Also, I haven't seen a failure which could be classed as 'Random'. Maybe randomly occurring, but with a specific reason for the failure. I've had a look at some of the links suggested in other postings. They are certainly interesting and contain good information, and I would say they are addressing the same issues which I believe should be addressed. However, these issues are so fundamental they should be part of the base case for maintenance, i.e., the correct tools, good information (drawings, procedures etc.), a safe working environment, adequate training etc. I've had Weibull mentioned by the Reliability group, along with an admission that they didn't understand Weibull curves, but someone higher up supported it, so that was good enough. This is where these tools become political and are then used for the wrong reasons and not to help increase efficiency, IMO. I'm not a Luddite. I believe that new technologies are useful and that progress is inevitable, but change in itself isn't progress. There should be some level of judgment applied. (end of rant) Regards, Joe Mc Cormack |
||||
|
| <Rui Assis>
|
Hi Steve,
1. The data were simply not from a trustworthy source (the man used to be too creative…!). Moreover, they were accompanied by comments that made me suspect they didn’t have anything to do with the condensed water trap. I used data only after a certain date, which coincides with another technician being allocated to this job.
2. I entered the data in Weibull++7 from Reliasoft and didn’t actually get a good fit for the hypothesis of a mixed Weibull. But I admit that as more data are gathered, that hypothesis might become evident.
3. I much appreciated your views on this. As I said, I recommended better surveillance and an investigation into the real cause(s) of failure in order to eliminate it (them) and drastically increase availability. I strongly suspect the operators (nurses).
4. With regard to stats, yes, I see what you mean, but… it is exactly because we are so often short of data that stats become useful in trying to work out what is hiding out there. From a stats perspective, I should not take the raw data as it is in order to base an argument. I need richer information, which I don’t have. Therefore I have to estimate what numbers are most likely missing in between the ones I actually have. It is not correct to calculate from a limited sample the same way we do when we address a continuous distribution… and time is actually a continuous variable. I would never say a PM interval of 21 hours would assure me a zero probability of failure. You say that, Steve, because you based your frequency analysis only on the 10 points available. But this is definitely not correct. If you accept the data being appropriately described by a Weibull, you will notice that the likelihood of a failure showing up by 21 hours is not zero but 0.04 instead (and 0.092 for 27 hours). This is a typical situation where stats can be of great assistance. Furthermore, data are random, and for this reason one cannot take any numbers as definitive but as lying within confidence intervals.
I attach an image of the Excel file I coded for my own use when addressing such cases. You can see that, although the difference is minimal, a 27 hours PM interval provides higher availability than 21. Common sense, though, will tell us to perform a sensitivity analysis (which you can see at the bottom of the spreadsheet) in order to choose from a range of acceptable values which might conform to constraints that exist in practice. Cost numbers are not real (just an example). Thank you very much for all the viewpoints you shared and the recommendations, which I found of value. I will be attentive. Regards, Rui
Autoclave.JPG (139 Kb, 9 downloads) |
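The two probabilities quoted in point 4 (0.04 at 21 and 0.092 at 27) are enough to back out the two-parameter Weibull they imply. The sketch below does only that. It assumes a two-parameter form with no location offset, it treats the rounded figures as exact, and the fitted values in the attached spreadsheet may differ. The function names are made up for illustration.

```python
import math


def weibull_from_two_points(t1, f1, t2, f2):
    """Solve F(t) = 1 - exp(-(t/eta)^beta) for beta and eta given two (t, F) pairs."""
    # Linearised form: ln(-ln(1 - F)) = beta * (ln t - ln eta)
    y1 = math.log(-math.log(1.0 - f1))
    y2 = math.log(-math.log(1.0 - f2))
    beta = (y2 - y1) / (math.log(t2) - math.log(t1))
    eta = t1 * math.exp(-y1 / beta)
    return beta, eta


def weibull_cdf(t, beta, eta):
    """Probability of failure by time t."""
    return 1.0 - math.exp(-((t / eta) ** beta))


beta, eta = weibull_from_two_points(21.0, 0.04, 27.0, 0.092)
print(f"implied beta={beta:.2f}, eta={eta:.1f}")
print(f"check: F(21)={weibull_cdf(21.0, beta, eta):.3f}, "
      f"F(27)={weibull_cdf(27.0, beta, eta):.3f}")
```

The implied shape comes out near 3.4 and the scale near 53, i.e. a wear-out pattern. That follows purely from the two quoted numbers and says nothing about whether a single Weibull is the right model for the data.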
||
|
Rui,
Just for the record..Almost always is not zero.
Rui, the above is what I said. Below is what you interpreted me as saying.
No problem at all... I often misinterpret things too. Thanks for your opinion - my comments are below. Steve |
||||
|
That is a very good reason to discard the numbers. |
||||
|
Rui, I don't accept that this data is appropriately described by Weibull, and this is my point on all these examples. Until you can confirm the failure mechanisms, Weibull may be inappropriate. If you plot the failures on a histogram like I have, it seems that failures don't start at all until 21 hours, then there is a random mode and possibly a mode that happens between 30 and 40 days. Weibull can be spectacularly unsuccessful when there are three or four failure mechanisms operating. Sometimes it is spectacularly unsuccessful when there are two.
I snipped the above from the website below. http://www.barringer1.com/tnwhb.htm First of all, there is an admission that not all distributions can be described by Weibull... secondly, my experience is that the number is far less than 85% to 90%. I think the number is lucky to be more than 50%. This is why most of the time I stick to my histogram solution and try as hard as I can to understand the data and what is driving it, as opposed to putting it into a Weibull calculator and producing some inferences. Rui... your analysis could well be correct. This is just my thinking and a caution to people who put their professional credibility on this kind of stuff. Regards Steve www.pmoptimisation.com |
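One way to see why a single Weibull struggles with more than one mechanism: on a Weibull plot, ln(-ln R) against ln t is a straight line whose slope is the shape parameter, but for a mix of two populations the slope changes with time. The sketch below computes that local slope. The two-mode population is entirely hypothetical (half random failures, half wear-out around 40 time units) and is not fitted to the autoclave data.

```python
import math


def weibull_sf(t, beta, eta):
    """Weibull survival function R(t)."""
    return math.exp(-((t / eta) ** beta))


def mixture_sf(t, modes):
    """Survival of a mixed population; modes is a list of (weight, beta, eta)."""
    return sum(w * weibull_sf(t, b, e) for w, b, e in modes)


def weibull_plot_slope(sf, t, h=1e-4):
    """Local slope of ln(-ln R) against ln t, by central difference."""
    lo, hi = t * (1.0 - h), t * (1.0 + h)
    y_lo = math.log(-math.log(sf(lo)))
    y_hi = math.log(-math.log(sf(hi)))
    return (y_hi - y_lo) / (math.log(hi) - math.log(lo))


# Hypothetical populations, for illustration only.
MODES = [(0.5, 1.0, 100.0), (0.5, 8.0, 40.0)]  # (weight, beta, eta)


def single(t):
    return weibull_sf(t, 2.0, 50.0)


def mixed(t):
    return mixture_sf(t, MODES)


for t in (10.0, 35.0, 80.0):
    print(f"t={t:5.1f}  single slope={weibull_plot_slope(single, t):.3f}  "
          f"mixed slope={weibull_plot_slope(mixed, t):.3f}")
```

For the single Weibull the slope stays at its shape parameter wherever it is measured. For the mixture it swings from about 1 early on to well above that near the wear-out peak and back down again, so no single straight line (one beta, one eta) describes it.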
||||
|
| <Rui Assis>
|
Steve,
With regard to the “zero or almost” issue, please forgive me for the misinterpretation. With regard to the so constantly mentioned Weibull analysis, I absolutely agree that it won’t always be the best fit. I have found this myself a few times, although I don’t have as much experience as you do. I have addressed this same subject a few posts ago in this same thread:
I will let you know of further progress regarding the autoclave issue when I get information on failure causes and collect more data. Yes, I agree, it is very likely that more than one failure mode coexists. Thank you once more. Regards, Rui |
||
|
Thanks for all of the interesting postings. I got a lot of good information. I downloaded the article from Ozgipsy, which was useful for me, as an explanation of the reasons for using statistics.
There are other postings on this topic of Improving Reliability which touch on the subject of competence, skill level etc. of the hands-on people in maintenance. This is, I believe, the crux. I believe in reliability more than Reliability, but it's not only the hands-on people who need to keep up to speed on skill, techniques and training. The support people, including stores, supervisors, managers etc., all need to be at a competent level of skill in their own jobs. It's all about teamwork and each member pulling their weight. I'm not being disrespectful to any of the people who post here. If you are taking part in this Forum, you are interested in learning and sharing, plus this is a major source of enlightenment to me. Regards Joe Mc Cormack |
||||
|
Joe,
I totally agree with you; we learn a lot from this forum through sharing and reading other people's posts. My warm regards, Rolly Angeles Teacher |
||||
|
Hi Gary,
I see you're back online. Great to have you back on the forum. Some time ago you posted an example that I thought was going to demonstrate the value of Weibull analysis. Can you provide us with your answers?
|
||||
|
Hello Steve,
Thanks for the welcome back and the reminder of my question. The MTBF for each component is 100 hours.
Component A = Infant Mortality: Eta = 122 hours, Beta = 0.60. For this component I would consider some RCA to determine why my component is failing. Is it fit for purpose? Is it an installation error or a commissioning error? I would speak with the manufacturer to determine what life I should expect from this component.
Component B = Wear Out: Eta = 100.5, Beta = 96.27. For this component I would look at fixed time replacement.
Of course, for both components, if the consequences of them failing gave me no cost or safety impact then maybe I would run them both to failure.
One of the advantages of Weibull analysis is the ability to analyze small amounts of failure data and get some reasonably accurate failure forecasts without having to wait for more failures. Please see the attached Weibull failure rate graphs for both components.
Cheers - Gary www.globalreliability.com
Component_Weibull.pdf (35 Kb, 20 downloads) Component Weibull |
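The attached failure rate graphs are not reproduced in the thread text, so here is a minimal sketch that recomputes the Weibull failure rate from the parameters quoted above. The beta and eta values are taken from the post; the evaluation times and the `hazard` helper are chosen here purely for illustration.

```python
import math


def hazard(t, beta, eta):
    """Weibull failure rate h(t) = (beta/eta) * (t/eta)^(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


COMPONENT_A = (0.60, 122.0)   # beta, eta as quoted in the post
COMPONENT_B = (96.27, 100.5)  # beta, eta as quoted in the post

for t in (10.0, 50.0, 90.0, 100.0):  # illustrative times in hours
    print(f"t={t:5.0f} h  A={hazard(t, *COMPONENT_A):.5f}  "
          f"B={hazard(t, *COMPONENT_B):.3e}")
```

With beta below 1, Component A's failure rate falls as the part ages, so replacing it on a fixed schedule only reintroduces the high early rate. With a very large beta, Component B's rate is negligible until shortly before eta and then climbs steeply, which is the case where fixed time replacement makes sense.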
||||
|