Commit Graph

668 Commits

Author SHA1 Message Date
Yefim Barashkin
8c37e7a200 drm/amd/pm: Prevent divide by zero
[ Upstream commit 0638c98c17aa12fe914459c82cd178247e21fb2b ]

divide error: 0000 [#1] SMP PTI
CPU: 3 PID: 78925 Comm: tee Not tainted 5.15.50-1-lts #1
Hardware name: MSI MS-7A59/Z270 SLI PLUS (MS-7A59), BIOS 1.90 01/30/2018
RIP: 0010:smu_v11_0_set_fan_speed_rpm+0x11/0x110 [amdgpu]

Speed is user-configurable through a file.
I accidentally set it to zero, and the driver crashed.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Yefim Barashkin <mr.b34r@kolabnow.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-21 21:24:29 +02:00
Mario Limonciello
0a9a60dced drm/amd: Refactor amdgpu_aspm to be evaluated per device
[ Upstream commit 0ab5d711ec74d9e60673900974806b7688857947 ]

Evaluating `pcie_aspm_enabled` as part of driver probe has the implication
that if one PCIe bridge with an AMD GPU connected doesn't support ASPM
then none of them do.  This is an invalid assumption as the PCIe core will
configure ASPM for individual PCIe bridges.

Create a new helper function that can be called by individual dGPUs to
react to the `amdgpu_aspm` module parameter without having negative results
for other dGPUs on the PCIe bus.

Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-07-12 16:35:06 +02:00
Yury Norov
b2d359f095 drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
[ Upstream commit 525d6515604eb1373ce5e6372a6b6640953b2d6a ]

The smu_v1X_0_set_allowed_mask() uses bitmap_copy() to convert
bitmap to 32-bit array. This may be wrong due to endiannes issues.
Fix it by switching to bitmap_{from,to}_arr32.

CC: Alexander Gordeev <agordeev@linux.ibm.com>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Claudio Imbrenda <imbrenda@linux.ibm.com>
CC: David Hildenbrand <david@redhat.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Janosch Frank <frankja@linux.ibm.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Sven Schnelle <svens@linux.ibm.com>
CC: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14 18:36:24 +02:00
Lijo Lazar
e0199ce728 drm/amd/pm: Fix missing thermal throttler status
[ Upstream commit b0f4d663fce6a4232d3c20ce820f919111b1c60b ]

On aldebaran, when thermal throttling happens due to excessive GPU
temperature, the reason for throttling event is missed in warning
message. This patch fixes it.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14 18:36:23 +02:00
Sathishkumar S
5005002b2e drm/amd/pm: update smartshift powerboost calc for smu13
[ Upstream commit cdf4c8ec39872a61a58d62f19b4db80f0f7bc586 ]

smartshift apu and dgpu power boost are reported as percentage
with respect to their power limits. adjust the units of power before
calculating the percentage of boost.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:22:38 +02:00
Sathishkumar S
c525d3385f drm/amd/pm: update smartshift powerboost calc for smu12
[ Upstream commit 138292f1dc00e7e0724f44769f9da39cf2f3bf0b ]

smartshift apu and dgpu power boost are reported as percentage with
respect to their power limits. This value[0-100] reflects the boost
for the respective device.

Signed-off-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:22:38 +02:00
Evan Quan
ae488dafe0 drm/amd/pm: fix the compile warning
[ Upstream commit 555238d92ac32dbad2d77ad2bafc48d17391990c ]

Fix the compile warning below:
drivers/gpu/drm/amd/amdgpu/../pm/legacy-dpm/kv_dpm.c:1641
kv_get_acp_boot_level() warn: always true condition '(table->entries[i]->clk >= 0) => (0-u32max >= 0)'

Reported-by: kernel test robot <lkp@intel.com>
CC: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:22:34 +02:00
Keita Suzuki
a5ce7051db drm/amd/pm: fix double free in si_parse_power_table()
[ Upstream commit f3fa2becf2fc25b6ac7cf8d8b1a2e4a86b3b72bd ]

In function si_parse_power_table(), array adev->pm.dpm.ps and its member
is allocated. If the allocation of each member fails, the array itself
is freed and returned with an error code. However, the array is later
freed again in si_dpm_fini() function which is called when the function
returns an error.

This leads to potential double free of the array adev->pm.dpm.ps, as
well as leak of its array members, since the members are not freed in
the allocation function and the array is not nulled when freed.
In addition adev->pm.dpm.num_ps, which keeps track of the allocated
array member, is not updated until the member allocation is
successfully finished, this could also lead to either use after free,
or uninitialized variable access in si_dpm_fini().

Fix this by postponing the free of the array until si_dpm_fini() and
increment adev->pm.dpm.num_ps everytime the array member is allocated.

Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:22:33 +02:00
Alex Deucher
0fad10b263 Revert "drm/amd/pm: keep the BACO feature enabled for suspend"
commit a56f445f807b0276fc0660c330bf93a9ea78e8ea upstream.

This reverts commit eaa090538e8d21801c6d5f94590c3799e6a528b5.

Commit ebc002e3ee78 ("drm/amdgpu: don't use BACO for reset in S3")
stops using BACO for reset during suspend, so it's no longer
necessary to leave BACO enabled during suspend.  This fixes
resume from suspend on the navy flounder dGPU in the ASUS ROG
Strix G513QY.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1982
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18 10:26:57 +02:00
Alex Deucher
b536cf3eb6 drm/amdgpu: don't use BACO for reset in S3
commit ebc002e3ee78409c42156e62e4e27ad1d09c5a75 upstream.

Seems to cause a reboots or hangs on some systems.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1924
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1953
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-13 20:59:25 +02:00
Alex Deucher
155338be5d drm/amdgpu/smu10: fix SoC/fclk units in auto mode
commit 2f25d8ce09b7ba5d769c132ba3d4eb84a941d2cb upstream.

SMU takes clock limits in Mhz units.  socclk and fclk were
using 10 khz units in some cases.  Switch to Mhz units.
Fixes higher than required SoC clocks.

Fixes: 97cf32996c ("drm/amd/pm: Removed fixed clock in auto mode DPM")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-13 20:59:24 +02:00
Yiqing Yao
ac98fdec11 drm/amd/pm: enable pm sysfs write for one VF mode
[ Upstream commit e610941c45bad75aa839af015c27d236ab6749e5 ]

[why]
pm sysfs should be writable in one VF mode as is in passthrough

[how]
do not remove write access on pm sysfs if device is in one VF mode

Fixes: 11c9cc95f818 ("amdgpu/pm: Make sysfs pm attributes as read-only for VFs")
Signed-off-by: Yiqing Yao <yiqing.yao@amd.com>
Reviewed-by: Monk Liu <Monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08 14:23:31 +02:00
Tom Rix
b175bc5864 drm/amd/pm: return -ENOTSUPP if there is no get_dpm_ultimate_freq function
[ Upstream commit 430e6a0212b2a0eb1de5e9d47a016fa79edf3978 ]

clang static analysis reports this represenative problem
amdgpu_smu.c:144:18: warning: The left operand of '*' is a garbage value
        return clk_freq * 100;
               ~~~~~~~~ ^

If there is no get_dpm_ultimate_freq function,
smu_get_dpm_freq_range returns success without setting the
output min,max parameters.  So return an -ENOTSUPP error.

Fixes: e5ef784b1e ("drm/amd/powerplay: revise calling chain on retrieving frequency range")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08 14:23:26 +02:00
Evan Quan
4543426cd7 drm/amd/pm: correct UMD pstate clocks for Dimgrey Cavefish and Beige Goby
[ Upstream commit 0136f5844b006e2286f873457c3fcba8c45a3735 ]

Correct the UMD pstate profiling clocks for Dimgrey Cavefish and Beige
Goby.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08 19:12:31 +01:00
Evan Quan
c00e4c01f4 drm/amd/pm: fix some OEM SKU specific stability issues
commit e3f3824874da78db5775a5cb9c0970cd1c6978bc upstream.

Add a quirk in sienna_cichlid_ppt.c to fix some OEM SKU
specific stability issues.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-02 11:47:49 +01:00
Yifan Zhang
3851046599 drm/amd/pm: correct the sequence of sending gpu reset msg
commit 9c4f59ea3f865693150edf0c91d1cc6b451360dd upstream.

the 2nd parameter should be smu msg type rather than asic msg index.

Fixes: 7d38d9dc4e ("drm/amdgpu: add mode2 reset support for yellow carp")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-23 12:03:08 +01:00
Yang Wang
dcd1c46634 drm/amd/pm: fix hwmon node of power1_label create issue
[ Upstream commit a8b1e8636a3252daa729762b2e3cc9015cc91a5c ]

it will cause hwmon node of power1_label is not created.

v2:
the hwmon node of "power1_label" is always needed for all ASICs.
and the patch will remove ASIC type check for "power1_label".

Fixes: ae07970a06 ("drm/amd/pm: add support for hwmon control of slow and fast PPT limit on vangogh")

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-16 12:56:31 +01:00
Evan Quan
4f4c77ad5a drm/amd/pm: correct the MGpuFanBoost support for Beige Goby
commit 3ec5586b4699cfb75cdfa09425e11d121db40773 upstream.

The existing way cannot handle Beige Goby well as a different
PPTable data structure(PPTable_beige_goby_t instead of PPTable_t)
is used there.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08 18:34:05 +01:00
Marina Nikolic
e4066c05d3 amdgpu/pm: Make sysfs pm attributes as read-only for VFs
[ Upstream commit 11c9cc95f818f0f187e9b579a7f136f532b42445 ]

== Description ==
Setting values of pm attributes through sysfs
should not be allowed in SRIOV mode.
These calls will not be processed by FW anyway,
but error handling on sysfs level should be improved.

== Changes ==
This patch prohibits performing of all set commands
in SRIOV mode on sysfs level.
It offers better error handling as calls that are
not allowed will not be propagated further.

== Test ==
Writing to any sysfs file in passthrough mode will succeed.
Writing to any sysfs file in ONEVF mode will yield error:
"calling process does not have sufficient permission to execute a command".

Signed-off-by: Marina Nikolic <Marina.Nikolic@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27 11:04:51 +01:00
Evan Quan
b8a1293e38 drm/amd/pm: keep the BACO feature enabled for suspend
commit eaa090538e8d21801c6d5f94590c3799e6a528b5 upstream.

To pair with the workaround which always reset the ASIC in suspend.
Otherwise, the reset which relies on BACO will fail.

Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-11 15:35:19 +01:00
Prike Liang
fbabb82b11 drm/amd/pm: skip setting gfx cgpg in the s0ix suspend-resume
[ Upstream commit 8c45096c60d6ce6341c374636100ed1b2c1c33a1 ]

In the s0ix entry need retain gfx in the gfxoff state,so here need't
set gfx cgpg in the S0ix suspend-resume process. Moreover move the S0ix
check into SMU12 can simplify the code condition check.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-11 15:35:18 +01:00
Lijo Lazar
12a4c1092a drm/amd/pm: Fix xgmi link control on aldebaran
[ Upstream commit 19e66d512e4182a0461530fa3159638e0f55d97e ]

Fix the message argument.
	0: Allow power down
	1: Disallow power down

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-11 15:35:17 +01:00
Mario Limonciello
e3a01c14e8 drm/amd/pm: fix reading SMU FW version from amdgpu_firmware_info on YC
commit dcd10d879a9d1d4e929d374c2f24aba8fac3252b upstream.

This value does not get cached into adev->pm.fw_version during
startup for smu13 like it does for other SMU like smu12.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22 09:32:49 +01:00
Lang Yu
257b3bb166 drm/amd/pm: fix a potential gpu_metrics_table memory leak
[ Upstream commit aa464957f7e660abd554f2546a588f6533720e21 ]

Memory is allocated for gpu_metrics_table in renoir_init_smc_tables(),
but not freed in int smu_v12_0_fini_smc_tables(). Free it!

Fixes: 95868b8576 ("drm/amd/powerplay: add Renoir support for gpu metrics export")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22 09:32:43 +01:00
Lijo Lazar
c786a7d5b8 drm/amd/pm: Remove artificial freq level on Navi1x
[ Upstream commit be83a5676767c99c2417083c29d42aa1e109a69d ]

Print Navi1x fine grained clocks in a consistent manner with other SOCs.
Don't show aritificial DPM level when the current clock equals min or max.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-08 09:04:39 +01:00
Alex Deucher
832c006eec drm/amdgpu/pm: fix powerplay OD interface
commit d5c7255dc7ff6e1239d794b9c53029d83ced04ca upstream.

The overclocking interface currently appends data to a
string.  Revert back to using sprintf().

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01 09:04:42 +01:00
Evan Quan
21d727a394 drm/amd/pm: avoid duplicate powergate/ungate setting
commit 6ee27ee27ba8b2e725886951ba2d2d87f113bece upstream.

Just bail out if the target IP block is already in the desired
powergate/ungate state. This can avoid some duplicate settings
which sometimes may cause unexpected issues.

Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789
Fixes: bf756fb833 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25 09:49:06 +01:00
Alex Deucher
e16a1e4ba6 drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
[ Upstream commit e9c76719c1e99caf95e70de74170291b9457bbc1 ]

sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!

v2: fix sysfs_emit -> sysfs_emit_at missed conversions

Cc: Lang Yu <lang.yu@amd.com>
Cc: Darren Powell <darren.powell@amd.com>
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:17:10 +01:00
Alex Deucher
1e7edd1834 drm/amdgpu/pm: properly handle sclk for profiling modes on vangogh
[ Upstream commit 68e3871dcd6e547f6c47454492bc452356cb9eac ]

When selecting between levels in the force performance levels interface
sclk (gfxclk) was not set correctly for all levels.  Select the proper
sclk settings for all levels.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1726
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:17 +01:00
Lijo Lazar
ab39d3cef5 drm/amd/pm: Update intermediate power state for SI
Update the current state as boot state during dpm initialization.
During the subsequent initialization, set_power_state gets called to
transition to the final power state. set_power_state refers to values
from the current state and without current state populated, it could
result in NULL pointer dereference.

For ex: on platforms where PCI speed change is supported through ACPI
ATCS method, the link speed of current state needs to be queried before
deciding on changing to final power state's link speed. The logic to query
ATCS-support was broken on certain platforms. The issue became visible
when broken ATCS-support logic got fixed with commit
f9b7f3703f ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1698

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-23 17:06:11 -04:00
Evan Quan
8b514e898e drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver
Current RUNPM mechanism relies on PMFW to master the timing for BACO
in/exit. And that needs cooperation from sound driver for dstate
change notification for function 1(audio). Otherwise(on sound driver
missing), BACO cannot be kicked in correctly and hang will be observed
on RUNPM exit.

By switching back to legacy message way on sound driver missing,
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-16 09:56:24 -04:00
Lang Yu
abd0a16ac7 drm/amdgpu: add manual sclk/vddc setting support for cyan skilfish(v3)
Add manual sclk/vddc setting supoort via pp_od_clk_voltage sysfs
to maintain consistency with other asics. As cyan skillfish doesn't
support DPM, there is only a single frequency and voltage to adjust.

v2: maintain consistency and add command guide.
v3: adjust user settings storage and coding style.

Command guide:
echo vc point sclk vddc > pp_od_clk_voltage
	"vc"    - sclk voltage curve
	"point" - must be 0
	"sclk"  - target value of sclk(MHz), should be in safe range
	"vddc"  - target value of vddc(mV), a 6.25(mV) stepping is
		  recommended and should be in safe range (the real
		  vddc is an approximation of target value)
echo c > pp_od_clk_voltage
	"c"	- commit the changes of sclk and vddc, only after
		  the commit command, the target values set by "vc"
		  command will take effect
echo r > pp_od_clk_voltage
	"r" 	- reset sclk and vddc to default value, a subsequent
		  commit command is needed to take effect

Example:
1) Check default sclk and vddc
	$ cat pp_od_clk_voltage
	OD_SCLK:
	0: 1800Mhz *
	OD_VDDC:
	0: 862mV *
	OD_RANGE:
	SCLK:    1000Mhz       2000Mhz
	VDDC:     700mV        1129mV
2) Set sclk to 1500MHz and vddc to 700mV
	$ echo vc 0 1500 700 > pp_od_clk_voltage
	$ echo c > pp_od_clk_voltage
	$ cat pp_od_clk_voltage
	OD_SCLK:
	0: 1500Mhz *
	OD_VDDC:
	0: 693mV *
	OD_RANGE:
	SCLK:    1000Mhz       2000Mhz
	VDDC:     700mV        1129mV
3) Reset sclk and vddc to default
	$ echo r > pp_od_clk_voltage
	$ echo c > pp_od_clk_voltage
	$ cat pp_od_clk_voltage
	OD_SCLK:
	0: 1800Mhz *
	OD_VDDC:
	0: 874mV *
	OD_RANGE:
	SCLK:    1000Mhz       2000Mhz
	VDDC:     700mV        1129mV
NOTE:
We don't specify an explicit safe range, you can set any values
between min and max at your own risk. Enjoy!

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Lang Yu
3061fe937e drm/amdgpu: add some pptable funcs for cyan skilfish(v3)
Add print_clk_levels and read_sensor pptable funcs for
cyan skilfish.

v2: keep consitency and add get_gpu_metrics callback.
v3: use sysfs_emit_at() in sysfs show function.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Lang Yu
c007e17c84 drm/amdgpu: update SMU driver interface for cyan skilfish(v3)
Add SmuMetrics_t definition for cyan skilfish.

v2: update SmuMetrics_t definition.
v3: cleanup and rearrange the order of fields.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Lang Yu
8492d3a07d drm/amdgpu: update SMU PPSMC for cyan skilfish
Add some PPSMC MSGs for cyan skilfish.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Lang Yu
8f48ba303d drm/amdgpu: fix sysfs_emit/sysfs_emit_at warnings(v2)
sysfs_emit and sysfs_emit_at requrie a page boundary
aligned buf address. Make them happy!

v2: use an inline function.

Warning Log:
[  492.545174] invalid sysfs_emit_at: buf:00000000f19bdfde at:0
[  492.546416] WARNING: CPU: 7 PID: 1304 at fs/sysfs/file.c:765 sysfs_emit_at+0x4a/0xa0
[  492.654805] Call Trace:
[  492.655353]  ? smu_cmn_get_metrics_table+0x40/0x50 [amdgpu]
[  492.656780]  vangogh_print_clk_levels+0x369/0x410 [amdgpu]
[  492.658245]  vangogh_common_print_clk_levels+0x77/0x80 [amdgpu]
[  492.659733]  ? preempt_schedule_common+0x18/0x30
[  492.660713]  smu_print_ppclk_levels+0x65/0x90 [amdgpu]
[  492.662107]  amdgpu_get_pp_od_clk_voltage+0x13d/0x190 [amdgpu]
[  492.663620]  dev_attr_show+0x1d/0x40

Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-09-16 09:56:23 -04:00
Kenneth Feng
5598d7c21a drm/amd/pm: fix the issue of uploading powerplay table
fix the issue of uploading powerplay table due to the dependancy of rlc.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-09-14 16:15:21 -04:00
Jiawei Gu
7884d0e9e3 drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode
Enable pp_num_states, pp_cur_state, pp_force_state, pp_table sysfs under
SRIOV 1-VF scenario.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-31 14:20:41 -04:00
Colin Ian King
f5d8e16488 drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum"
There are three identical spelling mistakes in dev_err messages.
Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:42 -04:00
Koba Ko
b3dc549986 drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform
Due to high latency in PCIE clock switching on RKL platforms,
switching the PCIE clock dynamically at runtime can lead to HDMI/DP
audio problems. On newer asics this is handled in the SMU firmware.
For SMU7-based asics, disable PCIE clock switching to avoid the issue.

AMD provide a parameter to disable PICE_DPM.

modprobe amdgpu ppfeaturemask=0xfff7bffb

It's better to contorl PCIE_DPM in amd gpu driver,
switch PCI_DPM by determining intel RKL platform for SMU7-based asics.

Fixes: 1a31474cdb ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 15:09:27 -04:00
Kees Cook
4a9bd6db19 drm/amd/pm: And destination bounds checking to struct copy
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

The "Board Parameters" members of the structs:
	struct atom_smc_dpm_info_v4_5
	struct atom_smc_dpm_info_v4_6
	struct atom_smc_dpm_info_v4_7
	struct atom_smc_dpm_info_v4_10
are written to the corresponding members of the corresponding PPTable_t
variables, but they lack destination size bounds checking, which means
the compiler cannot verify at compile time that this is an intended and
safe memcpy().

Since the header files are effectively immutable[1] and a struct_group()
cannot be used, nor a common struct referenced by both sides of the
memcpy() arguments, add a new helper, amdgpu_memcpy_trailing(), to
perform the bounds checking at compile time. Replace the open-coded
memcpy()s with amdgpu_memcpy_trailing() which includes enough context
for the bounds checking.

"objdump -d" shows no object code changes.

[1] https://lore.kernel.org/lkml/e56aad3c-a06f-da07-f491-a894a570d78f@amd.com

Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: Likun Gao <Likun.Gao@amd.com>
Cc: Jiawei Gu <Jiawei.Gu@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30 14:59:34 -04:00
Evan Quan
8ac1696b1d drm/amd/pm: a quick fix for "divided by zero" error
Considering Arcturus is a dedicated ASIC for computing, it
will be more proper to drop the support for fan speed reading
and setting. That's on the TODO list.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: Rui Teng <rui.teng@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20 12:06:14 -04:00
Colin Ian King
c94126c4aa drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"
There is a spelling mistake in a dev_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18 18:26:10 -04:00
Evan Quan
b64625a303 drm/amd/pm: correct the address of Arcturus fan related registers
These registers have different address from other SMU V11 ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
bc08cab690 drm/amd/pm: drop unnecessary manual mode check
As the fan control was guarded under manual mode before fan speed
RPM/PWM setting. Thus the extra check is totally redundant.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
0d8318e112 drm/amd/pm: drop the unnecessary intermediate percent-based transition
Currently, the readout of fan speed pwm is transited into percent-based
and then pwm-based. However, the transition into percent-based is totally
unnecessary and make the final output less accurate.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
d9ca7567b8 drm/amd/pm: correct the fan speed RPM retrieving
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
retrieving the fan speed RPM.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
fb1f667e71 drm/amd/pm: correct the fan speed PWM retrieving
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
retrieving the fan speed PWM.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
96401f7c21 drm/amd/pm: record the RPM and PWM based fan speed settings
As the relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, both the RPM and PWM
settings need to be saved.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00
Evan Quan
f3289d0497 drm/amd/pm: correct the fan speed RPM setting
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed
PWM and RPM is not true for SMU11 ASICs. So, we need a new way to
perform the fan speed RPM setting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16 15:35:56 -04:00