Acta mathematica scientia, Series A ›› 2000, Vol. 20 ›› Issue (1): 31-35.
Abstract:
In this paper, we consider sample path optimality for nonstationary MDPs with arbitrary state and action spaces under the average criterion. Using martingale theory, we prove the existence of optimal Markov policies under weak ergodic conditions, thereby extending the main results obtained by A. Arapostathis, V. Borkar, E. F. Gaucherand, M. Ghosh and S. Marcus [1] (1993).
Key words: Markov decision programming (MDP), Average sample path criterion, Nonstationary, Optimal Markov policies
GUO Xian-Beng. The sample path optimality for nonstationary MDP with average criterion[J]. Acta mathematica scientia, Series A, 2000, 20(1): 31-35.
URL: http://121.43.60.238/sxwlxbA/EN/
http://121.43.60.238/sxwlxbA/EN/Y2000/V20/I1/31
1 Arapostathis A, Borkar V, Gaucherand E F, Ghosh M, Marcus S. Discrete time controlled Markov processes with average cost criterion: a survey. SIAM J Control and Optimization, 1993, 31(2): 282-344
2 Hinderer K. Foundations of nonstationary dynamic programming with discrete time parameter. New York: Springer-Verlag, 1970
3 Rolando C C, Emanuel F G. Denumerable controlled Markov chains with average reward criterion: sample path optimality. ZOR Math Methods Oper Res, 1995, 41: 89-108
4 Lerma O. Adaptive Markov controlled processes. New York: Springer-Verlag, 1989
5 Park Y, Bean J C, Smith R L. Optimal average value convergence in nonhomogeneous Markov decision processes. J Math Anal Appl, 1993, 179: 526-536
6 Wei Liren, Guo Xianping. An average model for nonstationary MDP. Chinese Science Bulletin (科学通报), 1991, 10: 728-730