Rex Zhu
afa31879f0
drm/amd/powerplay: refine pwm1_enable callback functions for CI.
...
Use the new enums for setting and getting the fan control mode.
Fixes problems due to previous inconsistencies between enums.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:42 -04:00
Rex Zhu
2fde9ab218
drm/amd/powerplay: refine pwm1_enable callback functions for vi.
...
Use the new enums for setting and getting the fan control mode.
Fixes problems due to previous inconsistencies between enums.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:35 -04:00
Rex Zhu
7522ffc41b
drm/amd/powerplay: refine pwm1_enable callback functions for Vega10.
...
Use the new enums for setting and getting the fan control mode.
Fixes problems due to previous inconsistencies between enums.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:28 -04:00
Rex Zhu
aad22ca436
drm/amdgpu: refine amdgpu pwm1_enable sysfs interface.
...
Make the interface consistent.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:21 -04:00
Rex Zhu
4f93f09e5c
drm/amdgpu: add amd fan ctrl mode enums.
...
Add common fan enums that can be used for both
powerplay and dpm.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:14 -04:00
Rex Zhu
ded96c7389
drm/amd/powerplay: add more smu message on Vega10.
...
Add some new SMU messages.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:36:06 -04:00
Chunming Zhou
30514decb2
drm/amdgpu: fix dependency issue
...
The problem is that executing the jobs in the right order doesn't give you the right result
because consecutive jobs executed on the same engine are pipelined.
In other words job B does it buffer read before job A has written it's result.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:23:53 -04:00
Chunming Zhou
cb3696fdec
drm/amd: fix init order of sched job
...
Need to increment after the fence check.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com >
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 13:16:50 -04:00
Alex Deucher
09062ae1bb
drm/amdgpu: add some additional vega10 pci ids
...
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 11:25:10 -04:00
Alex Deucher
bfc181af3b
drm/amdgpu/soc15: use atomfirmware for setting bios scratch for reset
...
Need to use the atomfirmware interface rather than atombios since
soc15 is atomfirmware based.
Reviewed-by: Chunming Zhou <david1.zhou@amd.com >
Acked-by: Harry Wentland <harry.wentland@amd.com >
Acked-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 11:23:28 -04:00
Alex Deucher
de70c6357b
drm/amdgpu/atomfirmware: add function to update engine hang status
...
Update the scratch reg for when the engine is hung.
Reviewed-by: Chunming Zhou <david1.zhou@amd.com >
Acked-by: Harry Wentland <harry.wentland@amd.com >
Acked-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 11:23:17 -04:00
Julien Isorce
634b6a8a06
drm/radeon: only warn once in radeon_ttm_bo_destroy if va list not empty
...
Encountered a dozen of exact same backtraces when mesa's
pb_cache_release_all_buffers is called after that a gpu reset failed.
v2: Remove superfluous error message added in v1.
bug: https://bugs.freedesktop.org/show_bug.cgi?id=96271
Signed-off-by: Julien Isorce <jisorce@oblong.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-10 11:23:08 -04:00
Pixel Ding
db2c2a9798
drm/amdgpu: fix mutex list null pointer reference
...
Fix NULL pointer reference.
Signed-off-by: Pixel Ding <Pixel.Ding@amd.com >
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:15:21 -04:00
Rex Zhu
7b52db39a4
drm/amd/powerplay: fix bug sclk/mclk level can't be set on vega10.
...
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:15:08 -04:00
Rex Zhu
5784d5cca6
drm/amd/powerplay: Setup sw CTF to allow graceful exit when temperature exceeds maximum.
...
cherry-pick from amd windows driver.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:14:56 -04:00
Rex Zhu
652bd0c344
drm/amd/powerplay: delete dead code in powerplay.
...
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:14:44 -04:00
Guenter Roeck
af8baf1518
drm/amdgpu: Use less generic enum definitions
...
alpha:allmodconfig fails to build as follows.
drivers/gpu/drm/amd/amdgpu/amdgpu.h:1006:2: error:
expected identifier before '(' token
drivers/gpu/drm/amd/amdgpu/amdgpu.h:1011:28: error:
'NGG_BUF_MAX' undeclared here
The problem is not really the enum definition of NGG_BUF_MAX but PARAM,
which happens to be defined differently for alpha and a couple of other
architectures.
Use less generic defines for NGG enums to solve the problem.
Fixes: bce23e00f3 ("drm/amdgpu: add NGG parameters")
Cc: Christian König <christian.koenig@amd.com >
Cc: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Guenter Roeck <linux@roeck-us.net >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:14:32 -04:00
Alex Deucher
ad7d0ff3e7
drm/amdgpu/gfx9: derive tile pipes from golden settings
...
rather than hardcoding it.
Acked-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:14:16 -04:00
Alex Deucher
f47b77b4e4
drm/amdgpu/gfx: drop max_gs_waves_per_vgt
...
We already have this info: max_gs_threads. Drop the duplicate.
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:14:04 -04:00
Rex Zhu
327fce0c0d
drm/amd/powerplay: disable engine spread spectrum feature on Vega10.
...
Vega10 atomfirmware do not have ASIC_InternalSS_Info table
so disable this feature by default in driver.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Ken Wang <Qingqing.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:13:28 -04:00
Rex Zhu
b4a33e325c
drm/amd/powerplay: clean up code in vega10_smumgr.c
...
1. fix typo in print message info.
2. fix block comments's coding style.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:13:13 -04:00
Monk Liu
236763d340
drm/amdgpu:fix waiting on dirty fence
...
if bo->shadow is NULL (race issue:BO shadow was just released
and gpu-reset kick in but BO hasn't yet) recover_vram_from_shadow
won't set @next, so the following "fence=next"
will wrongly use a fence pointer which may already dirty.
fixing it by set next to NULL prior to recover_vram_from_shadow
Signed-off-by: Monk Liu <Monk.Liu@amd.com >
Reviewed-by: Chunming Zhou<david1.zhou@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:12:42 -04:00
Monk Liu
1d1a2cd58f
drm/amdgpu:PTE flag should be 64 bit width
...
otherwise we'll lost the high 32 bit for pte, which lead
to incorrect MTYPE for vega10.
Signed-off-by: Monk Liu <Monk.Liu@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:12:28 -04:00
Rex Zhu
0dca704751
drm/amd/powerplay: correct LoadLineResistance value in pptable.
...
this value is used by avfs to adjust inversion voltage.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:12:08 -04:00
Rex Zhu
b7a1f0e3cc
drm/amd/powerplay: Allow duplicate enteries in pptable.
...
This is a valid configuration.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:12:00 -04:00
Rex Zhu
56a2f08c41
drm/amd/powerplay: set fan target temperature by msg on vega10.
...
SMU not support FanTargetTemperature in pptable,
so send msg instand.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:11:52 -04:00
Rex Zhu
05ee321511
drm/amd/powerplay: set soc floor voltage on boot on vega10.
...
Send the VBIOS bootup VDDC as a SOC floor voltage to SMU
before populating the PPTABLE. After DPM is enabled, This
floor voltage will be removed. This will prevent SMC from
going to Vmin upon receiving PPTable causing a violation.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:11:44 -04:00
Rex Zhu
ef181f268e
drm/amd/powerplay: refine code in vega10_smumgr.c
...
1. return error code instand of -1.
2. print msg info if send msg failed
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-05 18:11:35 -04:00
Shaoyun Liu
43b9176faa
drm/amdgpu: Reserve 0-2 invalidation reg sets for none-amdgpu usages
...
Firmware used reg set 2 for tlb invalidation. AMDGPU can start from reg
set 3 to avoid the conflict. AMDKFD will use the reg set 0 or 1 when
necesary.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com >
Reviewws-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-02 13:18:03 -04:00
Alex Deucher
fca4ce697f
drm/amdgpu/gfx9: add additional MQD initialization
...
Need to properly set the ROQ space setting.
Reviewed-by: monk liu <monk.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-02 13:16:07 -04:00
Alex Deucher
0274a9c556
drm/amdgpu/gfx9: fix typo in mpd init
...
Using the wrong macro for soc15 register access.
Reviewed-by: monk liu <monk.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-02 13:15:39 -04:00
Alex Deucher
42ce22439f
drm/amdgpu/gfx9: use actual gpu num se setting for ngg allocation
...
Rather than using a hardcoded value.
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-02 13:15:25 -04:00
Alex Deucher
80112bffb0
drm/amdgpu: update revision id settings for BR/ST
...
Add new RIDs.
Acked-by: Christian König <christian.koenig@amd.com >
Acked-by: Alex Xie <AlexBin.Xie@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-02 13:15:05 -04:00
Michel Dänzer
98da65d5e3
Revert "drm/amdgpu: Refactor flip into prepare submit and submit. (v3)"
...
This reverts commit cb341a319f .
The purpose of the refactor was for amdgpu_crtc_prepare/submit_flip to
be used by the DC code, but that's no longer the case.
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-05-01 11:21:42 -04:00
Michel Dänzer
c81a1a7403
drm/amdgpu: Make amdgpu_bo_reserve use uninterruptible waits for cleanup
...
Some of these paths probably cannot be interrupted by a signal anyway.
Those that can would fail to clean up things if they actually got
interrupted.
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:16 -04:00
Rex Zhu
8b9242eddd
drm/amd/powerplay: implement stop dpm task for vega10.
...
Add functions to disable dpm for S3/S4.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Huang Rui <ray.huang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:16 -04:00
Rex Zhu
f8dc9476d9
drm/amd/powerplay: complete disable_smc_firmware_ctf_tasks.
...
Disable ctf in eventmgr to fix S3/S4 support.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Huang Rui <ray.huang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:15 -04:00
Rex Zhu
1dfc41d44c
drm/amd/powerplay: add disable_smc_ctf callback in hwmgr.
...
export disablesmcctf to eventmgr.
need to disable temperature alert when s3/s4.
otherwise, when resume back,enable temperature
alert will fail.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Huang Rui <ray.huang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:15 -04:00
Chunming Zhou
10e709cb29
drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2
...
the case could happen when gpu reset:
1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation.
2. gpu_reset routine will also need pd reservation to restore page table from their shadow.
3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation.
v2: handle amdgpu_cs_submit error path.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com >
Reviewed-by: Monk Liu <monk.liu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:14 -04:00
Junwei Zhang
44eb8c1b33
drm/amdgpu: bump version for exporting gpu info for gfx9
...
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:13 -04:00
Junwei Zhang
408bfe7c3c
drm/amdgpu: export more gpu info for gfx9
...
v2: 64-bit aligned for gpu info
v3: squash in wave_front_fix
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com >
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Qiang Yu <Qiang.Yu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:13 -04:00
Christian König
2c55b16bf0
drm/amdgpu: remove unused and mostly unimplemented CGS functions v2
...
Those functions are all unused and some not even implemented.
v2: keep cgs_get_pci_resource, it is used by the ACP driver.
Signed-off-by: Christian König <christian.koenig@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:12 -04:00
Rex Zhu
00c4855ef8
drm/amd/powerplay: refine set pcie dpm default table on vega10.
...
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:12 -04:00
Rex Zhu
97782cc93f
drm/amd/powerplay: disable cks by default on vega10.
...
run gpu test auto reboot when enable cks right now.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:11 -04:00
Rex Zhu
effa290caa
drm/amd/powerplay: correct UlvOffsetVid on Vega10.
...
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:10 -04:00
Alex Xie
12d39245f6
drm/amdgpu: Fix use of interruptible waiting
...
There is no good mechanism to handle the corresponding error.
When signal interrupt happens, unpin is not called.
As a result, inside AMDGPU, the statistic of pin size will be wrong.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:10 -04:00
Alex Xie
cca7ecb32b
drm/amdgpu: Fix use of interruptible waiting
...
Either in cgs functions or for callers of cgs functions:
1. The signal interrupt can affect the expected behaviour
2. There is no good mechanism to handle the corresponding error
3. There is no chance of deadlock in these single BO waiting
4. There is no clear benefit for interruptible waiting
5. Future caller of these functions might have same issue.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:09 -04:00
Chunming Zhou
a6bef67e2a
drm/amdgpu: fix NULL pointer error
...
[ 141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 141.420532] IP: [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.420563] PGD 20a030067
[ 141.420575] PUD 2088ca067
[ 141.420587] PMD 0
[ 141.420599] Oops: 0000 [#1 ] SMP
[ 141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E)
[ 141.420948] nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E)
[ 141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G OE 4.9.0-custom #4
[ 141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017
[ 141.421146] Workqueue: events amd_sched_job_timedout [amdgpu]
[ 141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000
[ 141.421193] RIP: 0010:[<ffffffff81579ee1>] [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.421229] RSP: 0018:ffffc900016f7d30 EFLAGS: 00010202
[ 141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000
[ 141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000
[ 141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000
[ 141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000
[ 141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60
[ 141.421390] FS: 0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000
[ 141.421421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0
[ 141.421471] Stack:
[ 141.421480] ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78
[ 141.421513] ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770
[ 141.421549] ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7
[ 141.421583] Call Trace:
[ 141.421628] [<ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu]
[ 141.421676] [<ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu]
[ 141.421712] [<ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm]
[ 141.421770] [<ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu]
[ 141.421829] [<ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu]
[ 141.421859] [<ffffffff81095493>] process_one_work+0x153/0x3f0
[ 141.421884] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0
[ 141.421907] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350
[ 141.421931] [<ffffffff8109b423>] kthread+0xd3/0xf0
[ 141.421951] [<ffffffff8109b350>] ? kthread_park+0x60/0x60
[ 141.421975] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30
[ 141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95
[ 141.422156] RIP [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60
[ 141.422183] RSP <ffffc900016f7d30>
[ 141.422197] CR2: 0000000000000030
[ 141.433483] ---[ end trace bc0949bf7ddd6d4b ]---
if the job is reset twice, then the parent could be NULL.
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:09 -04:00
Roger.He
8252131639
drm/amdgpu: validate shadow before restoring from it
...
Reviewed-by: Chunming Zhou <david1.zhou@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Roger.He <Hongbo.He@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:08 -04:00
Alex Xie
4a9ed1009b
drm/amdgpu: Fix use of interruptible waiting
...
1. The signal interrupt can affect the expected behaviour.
2. There is no good mechanism to handle the corresponding error.
When signal interrupt happens, unpin is not called.
As a result, inside AMDGPU, the statistic of pin size will be wrong.
Signed-off-by: Alex Xie <AlexBin.Xie@amd.com >
Reviewed-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2017-04-28 17:33:07 -04:00