n**f Posts: 121 | 1 I am having great trouble using maximum likelihood to estimate
distribution parameters. I would appreciate it if anyone could help me find out
what went wrong.
Assume that I have M iid samples of a random vector V, denoted by v_1, v_2, ..., v_M.
Define the random variable X = f(V|theta), where f is a function whose closed
form is known but whose parameter theta is unknown.
I assume X = f(V|theta) follows a lognormal distribution logN(mu,sigma), where
mu and sigma are unknown. My purpose is to JOINTLY estimate theta, mu, and
sigma. In practice all data samples come from simulation, so the lognormal
distribution is guaranteed.
I take the following approach: let g(x|mu,sigma) denote the pdf
of logN(mu,sigma). The likelihood function L is then g(f(V|theta)|mu,sigma).
Following MLE, we then maximize the sum of log(g(f(v_m|theta)|mu,sigma))
over all M samples. Since f(.) and g(.) are both known, this maximization problem can be
solved by nonlinear programming (at least in theory). The optimal solution
would provide the best estimates for theta, mu, and sigma simultaneously, and
decent properties such as consistency would follow.
In reality, however, when I inspect the result from the nonlinear programming
solver, I routinely find that the estimates for theta, mu, and sigma are far
from the true values. I say "true values" because I am using
simulated data. In particular, in many cases mu and sigma fall on the
boundary of the nonlinear programming problem. This happens even with a large
number of data samples, e.g., M = 10000. To make things worse, I
compared the value of the likelihood function at the theoretical optimum (the true
values) and at the optimization output. More often than not, a greater likelihood
(for the sample data) is achieved at the optimization output than at the true
values.
My question is, am I using MLE in the right way? I would appreciate it if
anyone could point out where the above description violates the assumptions of MLE.
Many many thanks! | F****n Posts: 3271 | 2 MLE needs a pdf. In your case, g(X) is a pdf, but you are actually maximizing
g(X(V)), which is a new function of V. Make sure it is also a pdf of V.
[In reply to n**f]
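The point in post 2 can be written out in symbols. This is a sketch using the standard change-of-variables rule, which the thread itself does not spell out; the scalar, invertible case in the last line is an illustrative assumption and not the poster's actual setting.

```latex
% Objective maximized in post 1, with g the lognormal pdf:
\ell(\theta,\mu,\sigma) = \sum_{m=1}^{M} \log g\bigl(f(v_m \mid \theta) \mid \mu,\sigma\bigr),
\qquad
g(x \mid \mu,\sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(\log x-\mu)^2}{2\sigma^2}\right).

% For \ell to be a log-likelihood of the observed v_m, the map
% v \mapsto g(f(v \mid \theta) \mid \mu,\sigma) must be a density in v:
\int g\bigl(f(v \mid \theta) \mid \mu,\sigma\bigr)\,dv = 1
\quad \text{for all } (\theta,\mu,\sigma).

% Illustrative special case: scalar v and invertible, differentiable f.
% The density of v then carries a Jacobian factor that depends on \theta:
p_V(v \mid \theta,\mu,\sigma)
  = g\bigl(f(v \mid \theta) \mid \mu,\sigma\bigr)
    \left|\frac{\partial f(v \mid \theta)}{\partial v}\right| .
```

Because the Jacobian factor depends on theta, dropping it changes where the maximum sits. When V is a vector and X a scalar, there is no such one-line formula; a joint density for all components of V has to be specified.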
| n**f Posts: 121 | 3 Thank you very much for the reply. I guess this is the problem.
However, the distribution of V is very hard to derive. Is there any way to
get around that? Like the method of moments?
Again, many thanks!
[In reply to F****n]
| o****o Posts: 8077 | 4 To me it seems to fall naturally into a Bayesian hierarchical framework.
Have you tried a Bayesian approach? Is the conditional distribution of [V_i|data, other parameters] hard to derive?
[In reply to n**f]
| n**f Posts: 121 | 5 V_i is the data I observe. But the closed-form distribution might be hard to
derive.
[In reply to o****o]
| F****n Posts: 3271 | 6 What do you mean by very hard to derive?
Can't you get the integral of g(X)?
[In reply to n**f]
| n**f Posts: 121 | 7 Thank you again for your reply.
g(X) is just the standard lognormal density. This part is straightforward.
My challenge is this: X is a scalar and V is a vector, so inverting f to get V = f'(x|theta)
is very difficult, at least for me. In reality, V is a K-dimensional
vector that satisfies the following equation:
X = v_1 / sum_{k=1...K-1}{theta_k*v_k}.
One can argue that V is not identifiable given X and theta.
[In reply to F****n]
| F****n Posts: 3271 | 8 Are you sure you have a solution? If V is not identifiable from X and theta,
then there are potentially a lot of different Vs and thetas that can
produce the same X (quite true for your sum/aggregate X function). Even if you
have pre-defined data, there can be other parameter values that
produce your pre-defined V.
IMO, you probably don't have an optimal solution no matter what method you use.
[In reply to n**f]
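One concrete way to see the problem is sketched below, assuming a ratio form like the one in post 7. The function `f`, the simulated data, and every numeric value are hypothetical choices for illustration, not the original poster's setup. Scaling theta by c > 0 divides X by c, and shifting mu by -log(c) compensates inside the exponent, but the 1/x factor of the lognormal pdf adds M*log(c) to the objective from post 1. That objective can therefore be pushed up without bound, which is consistent with mu running to the boundary and with the optimizer beating the true values.

```python
import numpy as np

def f(v, theta):
    # Hypothetical ratio form: first component divided by a
    # theta-weighted sum of the remaining components.
    return v[:, 0] / (v[:, 1:] @ theta)

def lognormal_logpdf(x, mu, sigma):
    # Log of the lognormal pdf g(x | mu, sigma).
    return (-np.log(x * sigma * np.sqrt(2.0 * np.pi))
            - (np.log(x) - mu) ** 2 / (2.0 * sigma ** 2))

def objective(v, theta, mu, sigma):
    # The quantity maximized in post 1: sum of log g(f(v_m | theta)).
    return lognormal_logpdf(f(v, theta), mu, sigma).sum()

# Simulate data so that X = f(V | theta_true) is exactly lognormal (assumed design).
rng = np.random.default_rng(0)
M = 10_000
theta_true = np.array([0.5, 1.5, 2.0])
mu_true, sigma_true = 0.3, 0.8

others = rng.uniform(1.0, 2.0, size=(M, theta_true.size))
x = rng.lognormal(mean=mu_true, sigma=sigma_true, size=M)
v = np.column_stack([x * (others @ theta_true), others])

# Rescale theta by c and shift mu by -log(c); sigma stays at its true value.
c = 10.0
at_truth = objective(v, theta_true, mu_true, sigma_true)
rescaled = objective(v, c * theta_true, mu_true - np.log(c), sigma_true)
gain = rescaled - at_truth

print(gain, M * np.log(c))  # the two numbers agree up to rounding
```

In this simulation design, a proper density for v_1 given the other components would carry an extra factor 1/(theta . v_rest), which cancels the gain exactly. The pairs (theta, mu) and (c*theta, mu - log(c)) would then be indistinguishable, so the scale of theta would have to be fixed (for example, one component set to 1) before any method could recover it.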
| n**f Posts: 121 | 9 Thank you for your patient comments. Now I know the issue is with the modeling
rather than the estimation.
[In reply to F****n]